Shifted Crop Corruption

Title
Shifted Crop Corruption
Detailed description
The original TIFF masters have undergone post-scan processing, for example cropping and conversion to JPEG2000.
Some of the processed images (both cropped TIFFs and JPEG2000s) have artefacts that appear to be a result of the cropping process.  The artefacts look like rows of pixels that have been shifted across the image.  For example, page edges or the gutter may appear within the body of the newspaper page.

The issue is how to identify which images in a large collection have this problem. For some images the errors are obvious as soon as they are rendered, but for others the errors are so subtle that zooming in and close inspection are necessary to spot them.  A tool to identify the faulty images is required.
It should not be assumed that the original TIFF masters are still available, although for the sample set they currently are.
It is not currently an aim to fix the images.
Issue champion Lynne Chivers
Other interested parties
Gerben van der Meulen, International Institute of Social History
Possible Solution approaches
  • Perform OCR on the image and use OCR failures to identify possible artefacts, since OCR may fail where the image contains errors.  This could be too computationally intensive, though.
  • Search for areas of the image that move suddenly from very black to very white rather than through a gradation.  This may give false positives, but if it reduces the visual search effort from millions of files to thousands it could be valuable.
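The second approach could be sketched roughly as follows. This is only a minimal illustration, not a tested solution for the collection: it assumes each image has already been decoded into a list of rows of greyscale values (0 = black, 255 = white), and the function names and all thresholds (`dark`, `light`, `min_fraction`) are hypothetical choices that would need tuning against real samples. Because the artefacts are described as shifted rows of pixels, the sketch compares each row with the row directly below it and flags row boundaries where a large share of columns jump straight between very dark and very light.

```python
def hard_edge_fraction(upper, lower, dark=50, light=200):
    """Fraction of columns where two vertically adjacent pixels jump
    straight between very dark and very light (no intermediate grey)."""
    if not upper:
        return 0.0
    jumps = 0
    for a, b in zip(upper, lower):
        if (a <= dark and b >= light) or (a >= light and b <= dark):
            jumps += 1
    return jumps / len(upper)


def suspicious_row_boundaries(pixels, min_fraction=0.2, dark=50, light=200):
    """Return each index i where the boundary between row i and row i+1
    has at least min_fraction of its columns showing a hard jump.

    pixels: list of rows, each row a list of greyscale values 0-255.
    The thresholds are illustrative assumptions, not calibrated values.
    """
    flagged = []
    for i in range(len(pixels) - 1):
        fraction = hard_edge_fraction(pixels[i], pixels[i + 1], dark, light)
        if fraction >= min_fraction:
            flagged.append(i)
    return flagged


# Synthetic example: a dark band (standing in for a gutter) on the left
# half of every row, with row 3 shifted so the band lands on the right.
clean = [[0] * 5 + [255] * 5 for _ in range(6)]
shifted = [row[:] for row in clean]
shifted[3] = [255] * 5 + [0] * 5

print(suspicious_row_boundaries(clean))
print(suspicious_row_boundaries(shifted))
```

An image with any flagged boundaries would go onto a shortlist for visual inspection rather than being declared faulty, since sharp printed rules or borders could trigger the same test.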
Context
Lessons Learned
Datasets BL 19th Century digitised newspaper collection
Solutions
Identify Shifted Crop Issue in JPEG2000
Corrupted JPEG and JPEG2000 files solution

Example issue

Labels: issue, spruce_glasgow, spruce, york_hackathon, bit_rot, qa