EAP File Verification

One line summary When media are detected, the tool identifies files of the selected format and reports which are valid, invalid or broken
Detailed description Solution for EAP Issue 1 Broken TIFFs
Same tool as for Solution 3


Developed using PHP and FITS (File Information Tool Set), with a little jQuery and jQuery UI for the results page.

1. Scan the specified directory for files.
2. If a file's type is of interest (in the demo, TIFFs), process the file.
3. If the file is a valid TIFF, extract its technical metadata.
4. If the technical metadata is of a sufficient standard, the file is classed as Good.
5. After each file is checked, write a progress file to disk to show how the scan is progressing.
6. Return lists of Bad files, Substandard files, Good files and Unprocessed files.
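
The steps above can be sketched roughly as follows. This is not the tool's code: the tool is written in PHP and uses FITS, whereas this is a minimal Python illustration that stands in for FITS with a basic check of the TIFF header and first image file directory (IFD). The names read_tiff_tags, is_sufficient and verify_directory are hypothetical, as are the JSON progress format and the "sufficient standard" rule (both image dimensions recorded). The sketch also assumes that "Unprocessed" means files that are not of the type of interest.

```python
import json
import struct
from pathlib import Path

TIFF_MAGIC = (b"II*\x00", b"MM\x00*")   # little- and big-endian TIFF headers
TIFF_SUFFIXES = {".tif", ".tiff"}        # the file type of interest in the demo
IMAGE_WIDTH, IMAGE_LENGTH = 256, 257     # baseline TIFF tag numbers


def read_tiff_tags(path):
    """Return {tag: value} for single SHORT/LONG tags in the first IFD,
    or None if the file does not parse as a TIFF (a stand-in for FITS)."""
    data = path.read_bytes()  # whole file in memory: acceptable for a sketch only
    if len(data) < 8 or data[:4] not in TIFF_MAGIC:
        return None
    order = "<" if data[:2] == b"II" else ">"
    ifd = struct.unpack_from(order + "I", data, 4)[0]
    if ifd < 8 or ifd + 2 > len(data):
        return None
    count = struct.unpack_from(order + "H", data, ifd)[0]
    if ifd + 2 + count * 12 > len(data):
        return None
    tags = {}
    for i in range(count):
        entry = ifd + 2 + i * 12
        tag, typ, n = struct.unpack_from(order + "HHI", data, entry)
        if n == 1 and typ in (3, 4):  # 3 = SHORT, 4 = LONG
            fmt = "H" if typ == 3 else "I"
            tags[tag] = struct.unpack_from(order + fmt, data, entry + 8)[0]
    return tags


def is_sufficient(tags, min_side=1):
    """Hypothetical quality rule: both image dimensions are recorded."""
    return (tags.get(IMAGE_WIDTH, 0) >= min_side
            and tags.get(IMAGE_LENGTH, 0) >= min_side)


def verify_directory(root, progress_path):
    """Classify every file under root and rewrite progress_path after each one."""
    root = Path(root)
    report = {"good": [], "substandard": [], "bad": [], "unprocessed": []}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        name = path.relative_to(root).as_posix()
        if path.suffix.lower() not in TIFF_SUFFIXES:
            report["unprocessed"].append(name)   # not a type of interest
        else:
            tags = read_tiff_tags(path)
            if tags is None:
                report["bad"].append(name)        # not a valid TIFF
            elif is_sufficient(tags):
                report["good"].append(name)
            else:
                report["substandard"].append(name)
        # Progress file is rewritten after every file so a results page can poll it.
        Path(progress_path).write_text(json.dumps(report, indent=2))
    return report
```

Calling verify_directory("scans", "progress.json") (both paths are placeholders) returns a dictionary with the four lists. A real validator such as FITS checks far more than the header, so a file this sketch calls "good" may still be damaged further in.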

Ideally, the solution would run automatically when new media are detected (AutoRun).

Files detected
Check if valid against type

Workflow diagram (see attached)




Solution champion John Salter, Matt Ruane
Git link  
Evaluation The user-friendly interface is useful for Content Owners and Project Holders, and the tool can be included in the Project Holder workflow, hopefully catching problems early enough that digitisation can be redone if necessary.
Information provided by the tool will also help us to narrow down further QA activity.
We will need to consult the Digital Preservation Team and IT before implementing it.
Further work could extend the tool to other file types (audio and video) and generate thumbnail views of "Good" files.
Could it tie in with the tool that identifies compressed TIFFs and converts them to uncompressed?

A re-coded interface that doesn't require any additional software or admin rights to run would be a good next step. This would make the software more widely available and easier to use.

It partially works but needs further development.
Tool (link)  
Issue EAP Issue 1 Broken TIFF images
Name Size Creator Creation Date Comment  
PDF File eap.pdf 20 kB Matthew Ruane Jun 15, 2011 12:26  
Labels: image, tiff, validation, identification, bit_rot_detection, solution, characterisation, quality_assurance