Detailed description
The original TIFF masters have undergone post-scan processing (for example, cropping and conversion to JPEG2000).
Some of the processed images (both cropped TIFFs and JPEG2000s) have artefacts that appear to result from the cropping process. The artefacts look like rows of pixels that have been shifted across the image, so that, for example, page edges or the gutter appear within the newspaper page.

The challenge is to identify which images in a large collection have this problem. For some images the errors are obvious as soon as they are rendered, but for others they are so subtle that zooming in and close inspection are necessary to spot them. A tool to identify the faulty images is required.
It should not be assumed that the original TIFF masters are still available, although for the sample set they currently are.
It is not currently an aim to fix the images.
Issue champion Lynne Chivers
Gerben van der Meulen, International Institute of Social History
  • Perform OCR on each image. OCR may fail where an image contains errors, and such failures could be used to identify possible artefacts. This could be too computationally intensive, though.
  • Search for areas of the image that move suddenly from very black to very white rather than through a gradation. This may give false positives, but if it reduces the visual search effort from millions of files to thousands it could be valuable.
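The second approach can be illustrated with a minimal sketch. It is not an agreed solution, only one way the idea might be tried. It assumes the image has already been decoded to a grid of 8-bit greyscale values (a list of rows), and it looks for row boundaries where many pixels jump straight between very dark and very light, as would happen where a block of rows has been shifted. The function name `abrupt_row_boundaries` and the thresholds `dark`, `light` and `min_count` are hypothetical and would need tuning against the sample set.

```python
def abrupt_row_boundaries(image, dark=40, light=215, min_count=3):
    """Return each row index y where at least `min_count` pixels jump
    between very dark and very light when going from row y-1 to row y.

    `image` is a list of rows of 8-bit greyscale values (0 = black).
    All threshold values are illustrative assumptions.
    """
    flagged = []
    for y in range(1, len(image)):
        above, below = image[y - 1], image[y]
        # Count columns whose value flips between the two extremes.
        jumps = sum(
            1
            for a, b in zip(above, below)
            if (a <= dark and b >= light) or (a >= light and b <= dark)
        )
        if jumps >= min_count:
            flagged.append(y)
    return flagged


# Synthetic 6x8 "page": a dark vertical bar (columns 2-4) on a light background.
page = [[255] * 8 for _ in range(6)]
for row in page:
    for x in range(2, 5):
        row[x] = 0

# Simulate the artefact by shifting rows 3 and 4 three pixels to the right.
for y in (3, 4):
    page[y] = page[y][-3:] + page[y][:-3]

print(abrupt_row_boundaries(page))  # [3, 5]
```

The flagged indices mark the top and bottom edges of the shifted block. On real newspaper scans, text and rules also produce sharp vertical transitions, so a check like this would be expected to give false positives and is only useful for narrowing down the set of images that need visual inspection.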
Datasets BL 19th Century digitised newspaper collection
Identify Shifted Crop Issue in JPEG2000
Corrupted JPEG and JPEG2000 files solution

