The pages below form a network of Datasets, preservation Issues with those Datasets, Contextual information, and Solutions to the Issues. Quads composed of a Dataset, an Issue, a Context and a Solution describe an end-to-end digital preservation challenge. This format was developed to capture the results of hackathon or mashup-type events held as part of the AQuA Project, and has since been developed and used within the SCAPE Project and at further hackathon events.
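The quad structure described above can be sketched as a small data model. This is only an illustration, not how the wiki stores its pages: the `Quad` class and the `solutions_for` helper are hypothetical names, and the grouping of the four page titles into one quad is assumed for the example rather than taken from the linked pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quad:
    """One end-to-end preservation challenge (hypothetical model)."""
    dataset: str
    issue: str
    context: str
    solution: str


def solutions_for(quads, dataset):
    """Return the sorted, de-duplicated Solutions linked to a Dataset."""
    return sorted({q.solution for q in quads if q.dataset == dataset})


# Page titles are taken from the lists on this page; treating them
# as a single quad is an assumption made for illustration.
quads = [
    Quad(
        dataset="Web based emails",
        issue='Web based email "harvesting"',
        context='Workflow "web based email"',
        solution="Harvest webmail accounts",
    ),
]

print(solutions_for(quads, "Web based emails"))
```

Keeping each quad as a flat record makes it easy to follow the network in either direction, for example from a Dataset to its Solutions or from a Solution back to the Issues it addresses.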
To add a new Dataset, Issue, Solution, or Scenario, see the following instructions: How to add a new Dataset, Issue, Solution or Scenario.
WARNING: There is a Confluence Wiki bug that sometimes causes problems with bulleted lists inside tables. If you're having trouble, don't use a list inside a table!
Last Event: Practical tools for digital preservation - A Hackathon, 27-29 September, York
Datasets:
These are the Datasets that relate to specific preservation Issues, which in turn have Solutions developed for them. Click this link to create a new Dataset, then edit the italicised text.
- ADS Grey Literature Library
- BL 19th Century digitised newspaper collection
- Computer game disc images
- Database containing a unique list of Danish words
- eTheses
- French Web Archives
- Ida Roper Herbarium archive
- Imperial College Exploration Board Adventure 2001 Comprising Overland Pakistan and Biafo Climbing Nick Adlam Alain Hosley James Smyth Tim Harris Nick Saunders
- International Institute for Social History
- LAVC audio
- Leeds image duplicates and versions
- Lovebytes Festival Media Archive
- PDF Creator Validator
- Realistic Disk Image Collection for Research and Education
- Script and Programming Language File Identifications
- Sgrin Archive
- Web based emails
Issues:
These are the preservation or other business-driven Issues that are found in particular Datasets and may have Solutions developed to solve them. Click this link to create a new Issue, then edit the italicised text.
- Ability to automatically identify script files
- Analyzing a disk image of a 12-year old laptop
- Automatically extracting metadata for Grey Literature reports
- Check content of e-pub against digitized book
- Checking that significant properties are preserved after migration
- Common validation error messages from PDF to PDFA conversion
- Data Extraction from real world Android Phone Images through BW-FLA Emulation as a service
- Decoding JP2 with OpenJPEG goes wrong in case of embedded ICC profiles
- Deduplication
- E-mail Threads - relinking the conversation
- ePub Version 2.0 Validation
- Extracting embedded objects from docx files
- Identifying content and Sorting
- Identifying web content
- Jhove reports error for non-standard violating criterion (imbalanced page trees)
- PDF Creator Validator (Issue)
- PDF to PDF A Conversion
- PDF to PDF-A conversion
- Permission Overlays
- Shifted Crop Corruption
- Sorting Error Messages by Pdf Creation Software
- Sound files, type and quality checking
- Truncated JPEG2000
- Verify if data (a file) is not existing anymore
- Web based email "harvesting"
Contexts:
These are the contextual requirements for a particular Issue and Solution. They will have implications for the design, embedding and use of the Solution. Click this link to create a new Context, then edit the italicised text.
- Archaeology Data Service
- Bibliothèque nationale de France (National Library of France, BnF)
- National Library of Wales
- Workflow "web based email"
Solutions:
These are Solutions that address Issues that relate to particular Datasets. Click this link to create a new Solution, then edit the italicised text.
- Convert embedded fonts to outlines
- Determine the format of a digital object
- Extracting embedded objects from Office OpenXML documents
- Fixes for some common PDF to PDFA conversion validation errors
- Harvest webmail accounts
- Identify Files Affected by Truncated-Fuzzy JPEG2000
- Identify Shifted Crop Issue in JPEG2000
- Mediainfo output viewer
- Open Planets Foundation - File Scanner
- PDF to PDFA Conversion
- PDF to PDF-A Conversion Pre-Processor
- Permissions Overlays
- Search Web Archive Data for Highlighted Text in Chrome
- Server MIME Type Correction
- Use ohcount to detect source code text files
- Validate and report filetypes per file
Tools: