Disassociation of files and metadata

Detailed description Each digitised page on the website must have a tif file, an htm file and a pdf file (plus other derivatives such as jpegs). These must all match each other (i.e. represent the same single page) and must also match the Access DB metadata.

  • Can a tool be used or made to produce an inventory of pdf and htm files (on Windows 7 Professional), in Excel format, which can then be added to the spreadsheet above?
  • The htm and pdf numbers would then sit in two columns that can be checked automatically against the tif number to flag any missing, duplicate or extra files.
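As a rough illustration of the inventory idea, the sketch below walks a folder and writes one row per page to a CSV file, which Excel can open. It is only a sketch under stated assumptions: the tif, htm and pdf files for a page are assumed to share the same base file name, and the function names, the column headings and the `EXTENSIONS` tuple are all hypothetical rather than part of any existing tool.

```python
import csv
from pathlib import Path

# Assumption: these three extensions identify the files of interest.
EXTENSIONS = ("tif", "htm", "pdf")


def build_inventory(root):
    """Return {page_id: {ext: [relative paths]}} for matching files under root.

    Assumption: the page identifier is the file name without its extension.
    """
    inventory = {}
    for path in Path(root).rglob("*"):
        ext = path.suffix.lower().lstrip(".")
        if path.is_file() and ext in EXTENSIONS:
            page = inventory.setdefault(path.stem, {e: [] for e in EXTENSIONS})
            page[ext].append(str(path.relative_to(root)))
    return inventory


def write_inventory_csv(inventory, csv_path):
    """Write one row per page, with one column per file type."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["page_id", *EXTENSIONS])
        for page_id in sorted(inventory):
            files = inventory[page_id]
            writer.writerow(
                [page_id] + ["; ".join(sorted(files[e])) for e in EXTENSIONS]
            )
```

An empty cell in the htm or pdf column would indicate a page with no matching file, and a cell holding more than one path would indicate a possible duplicate.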
Issue champion Francine Millard 
Other interested parties
Jenny Mitcham, Larry Murray
Possible Solution approaches
  • Are there any existing tools that can do these tasks?
  • Use Visual Basic to compare file names.
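The file-name comparison could be sketched as follows. Python is used here in place of Visual Basic purely for illustration. The function name `compare_to_tifs` and the three result keys are hypothetical, and the sketch assumes that a derivative matches a tif when their base file names are equal, ignoring case (as Windows does).

```python
from collections import Counter
from pathlib import PureWindowsPath


def page_id(file_name):
    """Base name without folder or extension, lowercased (assumed page key)."""
    return PureWindowsPath(file_name).stem.lower()


def compare_to_tifs(tif_names, other_names):
    """Compare derivative file names (htm or pdf) against the tif names.

    Returns page ids that are missing a derivative, that have more than
    one derivative, or that have a derivative but no tif.
    """
    tif_ids = {page_id(name) for name in tif_names}
    counts = Counter(page_id(name) for name in other_names)
    other_ids = set(counts)
    return {
        "missing": sorted(tif_ids - other_ids),
        "duplicate": sorted(i for i, n in counts.items() if n > 1),
        "extra": sorted(other_ids - tif_ids),
    }
```

The same call could be run once for the htm files and once for the pdf files, with the results pasted alongside the tif numbers in the spreadsheet.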
Context This would help the Digital Team deliver a more complete and accurate digital collection and will be of use to others who are involved in bulk file management and OCR management.
Lessons Learned TBC
Datasets India Papers Collection
Solutions File management and matching of tif, htm and pdf files solution
Labels file, management, excel, pdf, htm, tif, matching, ocr, tool, tiff, spruce_glasgow, issue, structural_relationships