Metadata extraction


Create a script to extract metadata, in particular descriptive metadata, from files.

The dataset is representative of the whole digital repository. We have identified that the collections can only be managed by first auditing what is already there, then mapping and improving the workflow. One way to do this is to run a script that maps what metadata already exists within the files.

This will assist with an audit of all existing files/data and help to identify and prioritise where work needs to be done on improving/adding metadata.
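As a starting point, an audit of this kind could be sketched as below. This is a minimal illustration, not the method adopted for this issue. It assumes that descriptive metadata, where present, is embedded as an XMP packet containing Dublin Core fields such as dc:title or dc:description, and it detects that by scanning the raw bytes of each file. Metadata stored in other ways (for example IPTC or EXIF fields without XMP) will be missed, so a dedicated tool such as ExifTool would give a fuller picture. The function names are hypothetical.

```python
import os
from collections import Counter

# Assumption: descriptive metadata is embedded as an XMP packet with
# Dublin Core fields. These byte markers are a crude heuristic only.
XMP_MARKER = b"<x:xmpmeta"
DESCRIPTIVE_TAGS = (b"dc:title", b"dc:description", b"dc:subject", b"dc:creator")


def has_descriptive_metadata(path):
    """Return True if the file appears to contain XMP descriptive fields."""
    # Reads the whole file into memory; acceptable for a sample audit,
    # but very large files may need chunked reading instead.
    with open(path, "rb") as handle:
        data = handle.read()
    if XMP_MARKER not in data:
        return False
    return any(tag in data for tag in DESCRIPTIVE_TAGS)


def audit(root):
    """Walk root and count files per (extension, has-metadata) pair."""
    counts = Counter()
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower() or "(none)"
            found = has_descriptive_metadata(os.path.join(folder, name))
            counts[(extension, found)] += 1
    return counts


def format_report(counts):
    """Turn the counts into tab-separated report lines, sorted by extension."""
    lines = []
    for (extension, found), total in sorted(counts.items()):
        status = "with" if found else "without"
        lines.append(f"{extension}\t{status} descriptive metadata\t{total}")
    return lines
```

Calling format_report(audit(folder)) on a sample folder and printing each line gives a per-format tally of files with and without descriptive metadata, which is one way to see where metadata work might be prioritised.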

Issues champion

Rachel MacGregor

Other interested parties

Any other parties who are also interested in applying Issue Solutions to their Datasets.

Possible Solution approaches
Brief brainstorm of possible approaches to solving the Issue. Each approach should be described in a single sentence as part of a bulleted list. Further detail can go in a dedicated Solution page.


The dataset comes from a digital repository in a large local authority archives service. It is a mixture of scanned images (for which a physical original exists) and born-digital images (for which no physical original exists). The files are in a mixture of formats and contain some or no descriptive metadata. The digital repository as a whole holds a wider variety of file formats and types. Very little work has been done on managing the collections. The priority is to audit what is already there and to develop a strategy for managing and developing the collections, with a view to implementing robust digital preservation strategies.

Lessons Learned
Notes on Lessons Learned from tackling this Issue that might be useful to inform digital preservation best practice

Vanley Burke Archive - sample for digital asset audit


Distinguishing Files with Descriptive Metadata

Labels: spruce_london, issue, unknown_characteristics