Page: BOPCRIS issue - ABBYY "Unknown error"
One line summary: ABBYY Recognition Server 3 intermittently gives an "Unknown error" message when processing collection files. ...
Page: Check consistency between metadata and content
One line summary: Check that the METS, OCR, JPEG2000 masters and the PDFs are consistent. Detailed description: As shown in the diagram below, check the image and ALTO file information defined in METS against the real files stored in separate Zip files. Also ...
Page: Compare OCR results of the same source material in different formats (TIFF, JP2)
One line summary: The intention of this solution was to compare two OCR results for the same images in two different formats: the original TIFF file and a JP2 (JPEG 2000) representation of that TIFF file. The goal was to find ...
Page: Identifying missed or duplicated pages
One line summary: Identifying missed or duplicated pages in books, archives and manuscripts. Detailed description: Multipaged items form the vast bulk of digitisation projects. There is always ...
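As an illustration of the kind of check this page is about, here is a minimal sketch in Python. It assumes that each scanned image already carries an integer page sequence number (for example parsed from a file name or from the structural metadata); the function name `check_page_sequence` and its parameters are hypothetical and do not come from the original page.

```python
from collections import Counter


def check_page_sequence(page_numbers, first=1, last=None):
    """Return (missing, duplicated) page numbers for one multipage item.

    page_numbers: iterable of integer sequence numbers found in the item
                  (assumed to be already extracted from file names or metadata).
    first, last:  expected inclusive range; if last is None, the highest
                  number seen is used, so missing trailing pages go undetected.
    """
    counts = Counter(page_numbers)
    if last is None:
        last = max(counts) if counts else first - 1
    # A page is missing if it lies in the expected range but was never seen.
    missing = [n for n in range(first, last + 1) if n not in counts]
    # A page is duplicated if its number occurs more than once.
    duplicated = sorted(n for n, c in counts.items() if c > 1)
    return missing, duplicated


if __name__ == "__main__":
    missing, duplicated = check_page_sequence([1, 2, 2, 4, 5])
    print("missing:", missing)
    print("duplicated:", duplicated)
```

A check on sequence numbers alone cannot tell whether two differently numbered images show the same physical page; that would need a comparison of the image or OCR content.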
Page: Newspaper issue dates
One line summary: For cataloguing purposes, it is essential that the issue date metadata is accurate. How can we ensure this? And can we predict where issues may be missing? Detailed description: Newspapers are structured by title, by year, and by issue in each ...
Page: OCR Comparison
One line summary: Compare two different OCR results. If the results are not sufficiently close, the source pages may differ, indicating possible issues. Detailed description: See detailed scenario descriptions below. Solution champion: Georg Petz & Sven ...
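The idea of flagging pages whose OCR results are "not sufficiently close" can be sketched as follows. This is only an illustration under stated assumptions, not the method used by the solution itself: it uses Python's standard `difflib.SequenceMatcher` as the similarity measure, and the function names and the default threshold of 0.9 are hypothetical choices.

```python
from difflib import SequenceMatcher


def ocr_similarity(text_a: str, text_b: str) -> float:
    """Return a similarity ratio in [0, 1] for two OCR text outputs."""
    # Collapse runs of whitespace so that differing line breaks
    # between the two OCR outputs do not dominate the score.
    norm_a = " ".join(text_a.split())
    norm_b = " ".join(text_b.split())
    return SequenceMatcher(None, norm_a, norm_b, autojunk=False).ratio()


def pages_probably_differ(text_a: str, text_b: str, threshold: float = 0.9) -> bool:
    """Flag a page pair for review when similarity falls below the threshold.

    The threshold is an assumed value and would need tuning on real data.
    """
    return ocr_similarity(text_a, text_b) < threshold


if __name__ == "__main__":
    a = "The quick brown fox\njumps over the lazy dog"
    b = "The quick brown fox jumps over the lazy dog"
    print(ocr_similarity(a, b), pages_probably_differ(a, b))
```

`SequenceMatcher` is quadratic in the worst case, so for long pages it may be more practical to compare word lists or to sample lines rather than whole-page character strings.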
Page: Use of OCR metadata
One line summary: How can we use OCR metadata to identify pages for human QC investigation? Detailed description: The ABBYY FineReader 9 engine outputs various OCR statistics, and these are expressed in the ALTO files. For each page there is a predicted ...
