Extracting and aggregating metadata with Tika
Apache Tika was used with a custom wrapper to extract metadata (e.g. author, title, extent, dates and file formats) and content (text) from the collection files. A Java script was then used to produce a report summarising the extracted metadata and content across the collection. This information will be used to inform collection management decisions and to identify potential preservation issues.
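The aggregation step could look roughly like the sketch below. It is not the code used here: it assumes the per-file metadata has already been extracted into simple key/value maps, and the class name MetadataReport, the method countBy and the sample records are all hypothetical. It uses only the Java standard library and counts how many files share each value of a chosen metadata field, such as the Content-Type key that Tika reports for file format.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: summarises already-extracted metadata records.
public class MetadataReport {

    /** Counts how many records share each value of the given metadata key. */
    public static Map<String, Integer> countBy(List<Map<String, String>> records, String key) {
        // TreeMap keeps the report rows sorted by value for readable output.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> record : records) {
            // Files lacking the field are grouped under a placeholder label.
            String value = record.getOrDefault(key, "(missing)");
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented sample records standing in for per-file extraction output.
        List<Map<String, String>> records = List.of(
            Map.of("Content-Type", "application/pdf", "Author", "A. Example"),
            Map.of("Content-Type", "application/pdf"),
            Map.of("Content-Type", "image/tiff", "Author", "A. Example"));

        // Print one tab-separated row per distinct file format.
        countBy(records, "Content-Type")
            .forEach((format, count) -> System.out.println(format + "\t" + count));
    }
}
```

Running the same method with a different key (for example "Author") would show how many files lack that field, which may help flag gaps in the collection's metadata.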
A link to the code on GitHub, or to a corresponding myExperiment workflow if applicable
Tool Registry Link
Add an entry to the OPF Tool Registry, and provide a link to it here.
Any notes or links on how the solution performed.