Title | Govdocs1 Open Corpus |
Description | A corpus of 1 million documents in various formats, drawn from US government web sites and freely available for research. |
Licensing | None. Free to use and distribute. |
Owner | N/A |
Dataset Location | http://digitalcorpora.org/corpora/files |
Collection expert | N/A |
* This dataset contains 231,683 PDFs totaling 127.8 GB.
Labels:
None