Automatically extracting metadata for Grey Literature reports

Detailed description Sometimes we receive batches of grey literature reports for which we have no metadata. This means we cannot include them in the Grey Literature Library, because they would not be discoverable. The only solution we have used so far is to open each report and create the metadata by hand. We generally don't have the time or money to do this, so we wait until we have a willing placement student! It would be great if this could be automated in some way. The metadata we need is quite detailed and hard to extract (location of archaeological fieldwork, monument types, artefacts and periods), but it may be more achievable to get basic details from the title page (report title, author/s, date produced, name of contracting unit/organisation).
Issue champion Jenny Mitcham (ADS) [email protected]
Other interested parties
Any other parties who are also interested in applying Issue Solutions to their Datasets
Possible Solution approaches
  • Check the PDF or DOC document properties to see whether they contain any useful metadata. Then check whether these values also appear on the title page of the report, to assess how reliable they are
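The reliability check in the approach above could be sketched roughly as follows. This is a minimal illustration, not a tested workflow: it assumes the document properties and the title-page text have already been extracted by some other tool (for a PDF, a library such as pypdf could supply both), and the function names `normalise` and `check_properties` and the sample values are hypothetical, invented for this example.

```python
import re
import unicodedata


def normalise(text):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def check_properties(properties, title_page_text):
    """Report, per property, whether its value appears on the title page.

    Returns a dict mapping property name to True or False, or to None when
    the property is empty and so cannot be checked.
    """
    # Pad with spaces so that only whole-word matches count.
    page = f" {normalise(title_page_text)} "
    results = {}
    for name, value in properties.items():
        cleaned = normalise(value or "")
        results[name] = (f" {cleaned} " in page) if cleaned else None
    return results


# Hypothetical sample values, standing in for what an extraction tool
# might return for one report.
sample_properties = {
    "Title": "Land at Example Farm: An Archaeological Evaluation",
    "Author": "Administrator",  # a typical software default, not a real author
    "Subject": "",
}
sample_title_page = (
    "LAND AT EXAMPLE FARM\n"
    "An Archaeological Evaluation\n"
    "Prepared by A. N. Other\n"
    "March 2009"
)

report = check_properties(sample_properties, sample_title_page)
for name, found in report.items():
    print(name, found)
```

A property that is confirmed on the title page is probably safe to reuse. One that is not confirmed (such as a default author name left by the word processor) would need a manual check. Exact matching after normalisation is a deliberately strict choice here; fuzzy matching might catch more cases at the cost of false positives.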
Context Details of the institutional context to the Issue. (May be expanded at a later date)
Lessons Learned Notes on Lessons Learned from tackling this Issue that might be useful to inform digital preservation best practice
Datasets ADS Grey Literature Library
Solutions Reference to the appropriate Solution page(s), by hyperlink