As part of developing the preservation strategy there is also a need to map duplicates and to check the authenticity and integrity of files. Part of this process is checksumming, using tools such as Fastsum, and looking at how this information is stored and built into the business workflow.
'md5sum' could also be used through the Cygwin command line in a Windows environment to provide checksums of files for free. Used in combination with 'find' and 'xargs', it can generate a manifest for a whole collection.
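A minimal sketch of this approach (the directory and manifest names here are illustrative, not part of any standard workflow):

```shell
#!/bin/sh
# Demo setup: a small sample collection (illustrative paths only).
mkdir -p collection/sub
printf 'alpha' > collection/a.txt
printf 'beta'  > collection/sub/b.txt

# Generate an MD5 manifest of every regular file in the collection.
# -print0 / -0 keep filenames containing spaces intact; xargs batches
# the file list into md5sum calls.
find collection -type f -print0 | xargs -0 md5sum > manifest-md5.txt

# At any later date, verify the collection against the stored manifest;
# md5sum -c reports OK or FAILED per file.
md5sum -c manifest-md5.txt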
It may also be worthwhile considering BagIt http://en.wikipedia.org/wiki/BagIt, which separates the data to be manifested from the manifest (metadata) itself whilst maintaining a link between the two.
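The BagIt layout can be sketched by hand to see how that separation works: payload files live under `data/`, while the bag declaration and checksum manifest sit alongside it at the bag root. The bag and file names below are illustrative; the structure (`bagit.txt`, `data/`, `manifest-md5.txt`) follows the BagIt specification.

```shell
#!/bin/sh
# Build a minimal BagIt bag by hand (illustrative payload).
mkdir -p bag/data
printf 'sample payload' > bag/data/image001.tif   # hypothetical asset

# Required bag declaration file at the bag root.
printf 'BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n' > bag/bagit.txt

# Payload manifest: one checksum per file, with paths relative to the
# bag root - this is the link between the metadata and the data.
( cd bag && find data -type f -print0 | xargs -0 md5sum > manifest-md5.txt )

# Verification later: ( cd bag && md5sum -c manifest-md5.txt )
```

Because the manifest lives outside `data/`, the payload can be moved or re-bagged without the metadata contaminating the checksummed content.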
Other interested parties
Any other parties who are also interested in applying Issue Solutions to their Datasets.
Possible Solution approaches
- testing out preservation planning tools
- suggestions of what worked/what didn't work from other developers/practitioners
This is part of a wider attempt to develop a robust preservation strategy for a small but growing digital repository in a large but underfunded local authority. The dataset is a small sample from a larger collection of digital assets with a variety of file types, descriptive metadata, etc.
Notes on Lessons Learned from tackling this Issue that might be useful to inform digital preservation best practice
Vanley Burke Archive - sample for digital asset audit
Reference to the appropriate Solution page(s), by hyperlink.