Title | Govdocs1 Open Corpus |
Description | A corpus of 1 million documents in various formats, drawn from US government web sites and freely available for research. |
Licensing | None. Free to use and distribute. |
Owner | N/A |
Dataset Location | http://digitalcorpora.org/corpora/files |
Collection expert | N/A |
Issues brainstorm | Should act as a representative corpus for web archive testing. |
List of Issues | A list of links to detailed Issue pages relevant to this Dataset |