| *Title* \\ | Govdocs1 Open Corpus \\ |
| *Description* | A corpus of 1 million documents in various formats, drawn from US government web sites and freely available for research. \\ |
| *Licensing* | None. Free to use and distribute. \\ |
| *Owner* | N/A \\ |
| *Dataset Location* | http://digitalcorpora.org/corpora/files \\ |
| *Collection expert* | N/A |
| *Issues brainstorm* | Should act as a representative corpus for web archive testing. \\ |
| *List of Issues* | _A list of links to detailed Issue pages relevant to this Dataset_ \\ |