  • identified at ingest, a more accurate identification with a better tool would be preferable. The aim is to get an idea of certain risks (intelligent reporting
  • NARA File Analyzer and Metadata Harvester Summary Purpose NARA File Analyzer and Metadata Harvester allows a user to analyze the contents of a file system or external drive and generates statistics about the contents of the contained directories. Homepage Source Code
  • IS7 Incompleteness and inconsistency of web archive data Title Incompleteness and/or inconsistency of web archive data Detailed description The best practice for preserving websites is to crawl them with a web crawler such as Heritrix. However, crawling is a process that is highly susceptible to errors. Often, esse
  • UK Web Domain Dataset Format Profile Title British Library UK Web Domain Dataset: Format Profile Description MIME type records have been created for the UK Web Domain Dataset, using three sources/tools: the MIME types delivered by the server, Apache Tika, and DROID. All three MIME types are collected, along with the year the
  • during identification Include formats during identification Exclude formats during identification Defining read buffers The definition of a read buffer in FIDO … GitHub issue page Increasing buffer size might or might not slow down the identification process. Increasing
  • EAP File Verification One line summary When media are detected, the tool will identify the selected format and identify valid / invalid / broken files Detailed description Solution for EAP Issue 1 Broken TIFFs Same tool as for Solution 3 Developed using PHP and FITS (and a bit of JQuery/UI for results page). 1. Scan sp
  • IS8 Diversity of office document formats in digital objects archive Title Diversity of office document formats in digital objects archive Detailed description Document instances of many different file formats are referenced in web content. Many of these formats might not be renderable in a web archive viewer in the fut
  • to identify the file format and some file properties. This means that it creates an XML description of the identification result, based on the set of identification tools that FITS wraps (e.g. DROID and JHOVE 1, among others) and whose characterisation output it normalizes. The „ReadTextFile“ component reads
  • of issues Identification and validation of multiple, esoteric and proprietary file formats. Identification of file corruption. Extraction of technical metadata … to equal-resolution WAV files for archival. AQuA:Identification and validation of esoteric audio file formats AQuA:Normalization of digital audio files
  • Scenario E Identification, integrity and consistency checking AQuA:Scenario E Identification, integrity and consistency checking Mette Andy Scenario F Detecting