State and University Library Denmark - Web Archive Data
Description 220 TB of web archive content in ARC format
Licensing This is a closed archive, accessible only to Danish researchers
Owner SB
Dataset Location Currently not available
Collection expert Bjarne Andersen (SB)
Issues brainstorm
  • At some point within the next 2 years we will need to migrate the content from ARC to WARC. A crucial step in this migration is automatic QA to ensure that each migrated container has exactly the same content as the original. This is very important since we do not have the budget to keep the original ARC files. Several institutions have already done this (e.g. BL), so tools most likely exist.
  • A general characterisation of web content is also needed to even begin talking about preservation of this kind of material.
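The automatic QA step mentioned for the ARC to WARC migration could, as a rough sketch, compare payload checksums record by record. The Python below is illustrative only and is not an existing tool: the function names and the (URI, payload) record shape are assumptions, and reading records out of the ARC and WARC containers is deliberately left out.

```python
import hashlib
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical record shape: (target URI, payload bytes).
# Extracting these pairs from real ARC/WARC files is out of scope here.
Record = Tuple[str, bytes]


def digest_multiset(records: Iterable[Record]) -> Counter:
    """Count (URI, SHA-256 of payload) pairs.

    A multiset is used so that repeated captures of the same URI
    with identical content are still counted individually.
    """
    return Counter(
        (uri, hashlib.sha256(payload).hexdigest()) for uri, payload in records
    )


def compare_containers(
    original: Iterable[Record], migrated: Iterable[Record]
) -> Tuple[Counter, Counter]:
    """Return (missing, unexpected) record counts.

    missing:    records in the original that are absent from the migrated copy
    unexpected: records in the migrated copy that are absent from the original
    Both being empty means the payloads match, regardless of record order.
    """
    before = digest_multiset(original)
    after = digest_multiset(migrated)
    return before - after, after - before
```

Under these assumptions, an original ARC file would only be considered safe to delete when both returned counters are empty for its migrated counterpart. This sketch checks payload bytes only; whether record headers and metadata must also match is a separate decision.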
List of Issues IS12 ARC to WARC migration, IS25 Web Content Characterisation