This page documents the results from the AQuA Project Mashups in Leeds and London. It describes digital Collections, the preservation or QA Issues associated with those Collections, the Contexts of the Collections and Issues, and the Solutions developed to address the Issues. The result is a network of information, illustrated diagrammatically by this example.
These are the Collections brought along to the AQuA Mashups by curators and digital preservation practitioners.
- Audio Collections
- Image Collections
  - 19th Century Books (BL)
  - BOPCRIS
  - Brightsolid digitisation of British Library newspapers
  - Digitised Books (ONB, Google Books)
  - Digitised Books (ONB)
  - East London Theatre Archive
  - Historic photographic collection
  - JISC1 19th Century Digitised Newspapers (BL)
  - Mass digitisation of images (York)
  - User generated content (images)
  - Wellcome Library digitisation
- MS Word Collections
- Multimedia Collections
- PDF Collections
These are actual or potential preservation or QA Issues relating to the Collections described above. Note that an Issue can be relevant to more than one Collection and an Issue may have more than one Solution.
- Issue - Solution evaluation questions
- Audio Issues
- Image Issues
  - Audit images against criteria
  - BOPCRIS issue - Mix of compressed and uncompressed TIFFs
  - De-duplication of multiple scanned images of same object
  - Duplicate images within a collection or job
  - EAP Issue 1 Broken TIFF images
  - EAP Issue 2 TIFF images that will not open in Photoshop or Adobe Bridge
  - EAP Issue 4 Detecting Visual Errors
  - Finding duplicate images
  - Historic photographic collection - check for versions of an image
  - Historic photographic collection - consistency across time or suppliers
  - Identification of same image at different levels of rotation
  - Newspaper issue dates
  - Quality assurance of a migration from TIFF to JPEG2000
  - Quality issues in digitised pages
  - Unknown JPEG2000 characteristics present risks to quality, preservation and access
  - Using METS data to inform analysis
  - Validating JPEG2000 files on conversion from TIFF, identifying and tracing source of errors
- Metadata Issues
  - Born-digital - log file checks
  - Born-digital - metadata validation
  - EAP Issue 3 Metadata Extraction from audio, video and image files
  - EAP Issue 5 Identifying empty folders
  - EAP Issue 6 Identify Missing or Out of sequence files
  - Extraction of metadata from digital audio files
  - Inconsistencies between metadata and content
  - Unknown born-digital file history
- MS Office Issues
- Multimedia Issues
- OCR Issues
- PDF Issues
These are the contextual and institutional requirements that apply to the Collections and Issues. They have implications for the design, embedding and use of the Solutions.
- Brightsolid
- British Library Sound & Vision
- EAP Context
- Externally Created Archive Context
- Image collection context
- London School of Economics
- NANETH
- University of Southampton Library Digitisation Unit Context
- University of York Context
- Wellcome Library
These are the Solutions to the Issues outlined above that were developed by the participants at the AQuA Mashup events.
- Audio Solutions
- Image Solutions
  - EAP File Verification
  - Identify compressed TIFFs and convert them to uncompressed TIFFs
  - Identifying rotated, duplicate images using pHash
  - java image blocks comparison
  - jp2 header analysis
  - Newspaper issue dates - solution
  - Perceptual Image Diff comparison
  - ssdeep for duplicate image detection
  - tiff2RDF - visualising image collection consistency
  - Validating TIFF to JPEG2000 migration
- Metadata Solutions
- MS Office Solutions
- Multimedia Solutions
- OCR Solutions
- PDF Solutions
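
The network described on this page (an Issue can relate to more than one Collection and can have more than one Solution) could be modelled roughly as below. This is only a sketch: the `Issue` class and the two helper functions are hypothetical, and the particular links between Issues, Collections and Solutions in the sample data are illustrative assumptions, not the actual links recorded on the individual wiki pages.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Issue:
    """One Issue page, with the titles of the pages it links to."""
    title: str
    collections: list[str] = field(default_factory=list)
    contexts: list[str] = field(default_factory=list)
    solutions: list[str] = field(default_factory=list)


def issues_for_collection(issues: list[Issue], collection: str) -> list[str]:
    """Return the titles of all Issues linked to the given Collection."""
    return [i.title for i in issues if collection in i.collections]


def solutions_for_collection(issues: list[Issue], collection: str) -> list[str]:
    """Return every Solution reachable from a Collection via its Issues,
    without duplicates and in first-seen order."""
    found: list[str] = []
    for issue in issues:
        if collection in issue.collections:
            for solution in issue.solutions:
                if solution not in found:
                    found.append(solution)
    return found


# Illustrative links only (assumed for this example).
issues = [
    Issue(
        "Finding duplicate images",
        collections=["Historic photographic collection",
                     "User generated content (images)"],
        solutions=["ssdeep for duplicate image detection",
                   "Identifying rotated, duplicate images using pHash"],
    ),
    Issue(
        "Identification of same image at different levels of rotation",
        collections=["Historic photographic collection"],
        solutions=["Identifying rotated, duplicate images using pHash"],
    ),
]

print(issues_for_collection(issues, "User generated content (images)"))
print(solutions_for_collection(issues, "Historic photographic collection"))
```

Storing the links on the Issue keeps the many-to-many relationships in one place, so a Collection's Solutions are always found by walking through its Issues, which mirrors how the pages here are organised.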