Metadata extraction



Writing code to extract metadata, and in particular descriptive metadata, from files.

The data set is representative of the whole digital repository.  We have identified that the collections can only be managed once we have audited what is already there and then mapped and improved the workflow.  One way to do this is to run code that maps what metadata already exists within the files.

This will assist with an audit of all existing files/data and help to identify and prioritise where work needs to be done on improving/adding metadata.
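
As a starting point for such an audit, the following is a minimal sketch rather than a finished tool. It walks a directory tree, records each file's extension, and checks whether the file appears to contain an embedded XMP packet or an Exif segment by searching for their usual byte signatures. The function names (audit_file, audit_tree, summarise) are hypothetical, and the signature search is an assumption: it only detects metadata stored uncompressed, it does not read any field values, and it loads each file fully into memory, so it is only suitable for a first pass over modestly sized files.

```python
import os
from collections import Counter

# Byte signatures used as a rough presence check. Assumption: the metadata
# is stored uncompressed, as is usual for XMP packets and JPEG Exif segments.
SIGNATURES = {
    "xmp": b"<x:xmpmeta",
    "exif": b"Exif\x00\x00",
}


def audit_file(path):
    """Return the set of metadata kinds whose signature appears in the file."""
    with open(path, "rb") as handle:
        data = handle.read()
    return {kind for kind, marker in SIGNATURES.items() if marker in data}


def audit_tree(root):
    """Walk root and return one row per file: (path, extension, kinds found)."""
    rows = []
    for folder, _subfolders, names in os.walk(root):
        for name in sorted(names):
            path = os.path.join(folder, name)
            extension = os.path.splitext(name)[1].lower()
            rows.append((path, extension, audit_file(path)))
    return rows


def summarise(rows):
    """Count files per extension, and those with no detected metadata."""
    totals = Counter()
    without = Counter()
    for _path, extension, kinds in rows:
        totals[extension] += 1
        if not kinds:
            without[extension] += 1
    return totals, without
```

Calling summarise(audit_tree(root)) on a copy of the data set would give, per file extension, the number of files and the number with no detected embedded metadata, which is one way to see where metadata work might be prioritised. A dedicated metadata extraction tool would be needed to report the actual descriptive fields.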

Issues champion

Rachel MacGregor

Other interested parties

Any other parties who are also interested in applying Issue Solutions to their Datasets.

Possible Solution approaches
Brief brainstorm of possible approaches to solving the Issue. Each approach should be described in a single sentence as part of a bulleted list. Further detail can go in a dedicated Solution page.


The dataset comes from a digital repository in a large local authority archives service and represents a mixture of scanned images (for which a physical original exists) and born-digital images (for which a physical original does not exist).  There is a mixture of file formats, and the files contain some or no descriptive metadata.  Within the digital repository as a whole there are more varied file formats and types.  Very little work has been done on managing the collections.  The priority is an audit of what is already there, followed by a strategy to manage and develop the collections in the future, with a view to implementing robust digital preservation strategies.

Lessons Learned
Notes on Lessons Learned from tackling this Issue that might be useful to inform digital preservation best practice

Reference to the appropriate Dataset page, by hyperlink. Note that all Issues MUST be linked to at least one Dataset!

Reference to the appropriate Solution page(s), by hyperlink.
