Newspaper issue dates

One line summary For cataloguing purposes, it is essential that the issue date metadata is accurate. How can we ensure this? And can we predict where issues may be missing?
Detailed description Newspapers are structured by title, by year, and by issue within each year. The issue boundaries are determined at scan time and confirmed in DocWorks. The issue dates are then added by a person who reads the OCRed issue date and compares it against the scanned image. How can we ensure that these dates are accurate? And how can we identify missing issues?
Issue champion Toby Atkin-Wright
Possible approaches Currently the Brightsolid project reads the issue date from each issue in a year, and predicts what the likely publication pattern was. Using the estimated publication pattern, it highlights issues that don't fit, and that need human QC investigation. The calculation of the publication pattern is naive, and could be significantly improved.
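As an illustration of the idea, and not the Brightsolid implementation, the following minimal sketch estimates a publication pattern from the weekdays on which a year's issues fall, then flags issues whose date does not fit. The function names and the `threshold` parameter are hypothetical, and the sketch assumes the pattern can be described by weekdays alone.

```python
from collections import Counter
from datetime import date, timedelta


def estimate_publication_weekdays(issue_dates, threshold=0.25):
    """Return the weekdays (0 = Monday) that look like regular publication days.

    A weekday counts as regular if it occurs at least `threshold` times as
    often as the most common weekday. The threshold is an arbitrary
    assumption and would need tuning against real titles.
    """
    counts = Counter(d.weekday() for d in issue_dates)
    if not counts:
        return set()
    cutoff = threshold * max(counts.values())
    return {weekday for weekday, n in counts.items() if n >= cutoff}


def flag_outliers(issue_dates, threshold=0.25):
    """Return the issue dates that fall outside the estimated pattern."""
    pattern = estimate_publication_weekdays(issue_dates, threshold)
    return [d for d in issue_dates if d.weekday() not in pattern]


# Hypothetical weekly title: twenty issues seven days apart, plus one
# issue whose recorded date falls three days after the first issue.
first = date(1850, 1, 4)
weekly = [first + timedelta(weeks=i) for i in range(20)]
suspect = first + timedelta(days=3)
print(flag_outliers(weekly + [suspect]))
```

A weekday-only model cannot represent titles that changed frequency during the year, which is one way the estimate could be improved.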
The issue order is also compared against the order of the originally scanned TIFF files; if the order does not match, the issues are highlighted for human QC investigation.
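The order check can be sketched as follows. This is only an illustration: it assumes each issue is represented as a `(tiff_name, issue_date)` pair and that the TIFF file names sort in scan order, neither of which is confirmed by the project description. The function name is hypothetical.

```python
from datetime import date


def order_mismatches(issues):
    """Return adjacent pairs, in scan order, whose issue dates go backwards.

    `issues` is a list of (tiff_name, issue_date) tuples. Scan order is
    assumed to be the lexicographic order of the TIFF file names.
    """
    in_scan_order = sorted(issues, key=lambda item: item[0])
    return [
        (earlier, later)
        for earlier, later in zip(in_scan_order, in_scan_order[1:])
        if later[1] < earlier[1]
    ]


# Hypothetical data: the second and third scans have dates in the wrong order.
issues = [
    ("0001.tif", date(1850, 1, 4)),
    ("0002.tif", date(1850, 1, 18)),
    ("0003.tif", date(1850, 1, 11)),
    ("0004.tif", date(1850, 1, 25)),
]
for earlier, later in order_mismatches(issues):
    print("check:", earlier[0], earlier[1], "then", later[0], later[1])
```

Both members of each pair are reported because the check alone cannot tell which of the two dates is wrong.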
Once these issue dates have been confirmed or fixed, possible missing issues are identified based on the estimated publication pattern. This could be improved by taking account of the volume and issue numbers that are included in the METS files.
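A minimal sketch of the missing-issue step, assuming the publication pattern is expressed as a set of weekdays (0 = Monday) and that the issue dates have already been confirmed. The function name and the weekday representation are assumptions made for illustration.

```python
from datetime import date, timedelta


def missing_issue_dates(issue_dates, publication_weekdays):
    """Return the expected publication dates that have no issue.

    Only the span between the first and last known issue is examined, so
    issues missing before the first or after the last date are not found.
    """
    present = set(issue_dates)
    if not present:
        return []
    day, last = min(present), max(present)
    missing = []
    while day <= last:
        if day.weekday() in publication_weekdays and day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Hypothetical weekly title with the fourth of six issues absent.
first = date(1850, 1, 4)
weekly = [first + timedelta(weeks=i) for i in range(6)]
held = weekly[:3] + weekly[4:]
print(missing_issue_dates(held, {first.weekday()}))
```

Dates reported this way are only candidates: a title may legitimately have skipped a day, for example on a public holiday, so each one still needs human confirmation.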
The volume and issue numbers could be used to confirm issue order, and to offer possible corrections where issues appear to be out of order.
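The following sketch shows one way the volume and issue numbers might be used. It assumes the numbers have already been extracted from the METS files into simple records (the extraction itself is not shown), and that issue numbers increase by one within a volume. The `Issue` type and both function names are hypothetical.

```python
from datetime import date
from typing import NamedTuple


class Issue(NamedTuple):
    issue_date: date
    volume: int
    number: int


def numbering_conflicts(issues):
    """Return adjacent pairs, in date order, whose numbering does not increase."""
    by_date = sorted(issues, key=lambda i: i.issue_date)
    return [
        (a, b)
        for a, b in zip(by_date, by_date[1:])
        if (b.volume, b.number) <= (a.volume, a.number)
    ]


def numbering_gaps(issues):
    """Return (volume, [missing numbers]) for gaps within a volume."""
    by_number = sorted(issues, key=lambda i: (i.volume, i.number))
    gaps = []
    for a, b in zip(by_number, by_number[1:]):
        if a.volume == b.volume and b.number - a.number > 1:
            gaps.append((a.volume, list(range(a.number + 1, b.number))))
    return gaps


# Hypothetical data: numbers 2 and 3 carry swapped dates, and number 4 is absent.
issues = [
    Issue(date(1850, 1, 4), 1, 1),
    Issue(date(1850, 1, 11), 1, 3),
    Issue(date(1850, 1, 18), 1, 2),
    Issue(date(1850, 1, 25), 1, 5),
]
print(numbering_conflicts(issues))
print(numbering_gaps(issues))
```

Sorting a conflicting run by `(volume, number)` gives a candidate corrected order to offer to the QC operator. Printed issue numbers can themselves be wrong, so the result is a suggestion, not a fix.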
Context Brightsolid
AQuA Solutions Newspaper issue dates - solution
Collections Brightsolid digitisation of British Library newspapers
Labels metadata, ocr, issue, unknown_characteristics