Checksumming

Detailed description

As part of developing the preservation strategy, there is a need to identify duplicate files and to check the authenticity and integrity of files.  Checksumming is one part of this process.  It involves using tools such as Fastsum (http://www.fastsum.com/) and looking at how the resulting checksum information is stored and built into the business workflow.
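
To illustrate how checksums can be used to map duplicates, here is a minimal Python sketch. It is not taken from Fastsum or any tool named on this page. The function names `file_md5` and `find_duplicates` are hypothetical, and MD5 is assumed only because it is the algorithm mentioned here. Files with the same checksum are grouped together as likely duplicates.

```python
import hashlib
import sys
from collections import defaultdict
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by checksum and keep only groups with more than one file."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_md5(path)].append(path)
    return {checksum: paths for checksum, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    # Hypothetical usage: pass the directory to scan as the first argument.
    start = sys.argv[1] if len(sys.argv) > 1 else "."
    for checksum, paths in find_duplicates(start).items():
        print(checksum)
        for path in paths:
            print("   ", path)
```

Matching checksums are a strong hint rather than proof of duplication, so a byte-by-byte comparison may be worth doing before anything is deleted.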

'md5sum' could also be used from the Cygwin command line in a Windows environment to generate checksums of files at no cost.  Combined with 'find' and 'xargs', it could be used to generate a manifest of checksums for a whole directory tree.
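
For cases where Cygwin is not available, the same idea can be sketched with the Python standard library. This is an illustration rather than a replacement for 'md5sum'. The function names `write_manifest` and `verify_manifest` are hypothetical, and the "checksum, two spaces, relative path" line layout is an assumption intended to resemble 'md5sum' output.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root, manifest_path):
    """Write one 'checksum  relative/path' line per file under root; return the file count."""
    root = Path(root)
    manifest_path = Path(manifest_path)
    lines = []
    for path in sorted(root.rglob("*")):
        # Skip the manifest itself in case it is stored inside root.
        if path.is_file() and path.resolve() != manifest_path.resolve():
            lines.append(f"{file_md5(path)}  {path.relative_to(root).as_posix()}")
    manifest_path.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)


def verify_manifest(root, manifest_path):
    """Return (relative path, status) pairs for entries that are 'missing' or 'changed'."""
    problems = []
    for line in Path(manifest_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, relative = line.split("  ", 1)
        target = Path(root) / relative
        if not target.is_file():
            problems.append((relative, "missing"))
        elif file_md5(target) != expected:
            problems.append((relative, "changed"))
    return problems
```

Re-running `verify_manifest` at intervals against a stored manifest is one way an integrity check could be built into a workflow. An empty result means every listed file is present and unchanged, but files added since the manifest was written are not reported.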

It may also be worthwhile considering BagIt (http://en.wikipedia.org/wiki/BagIt), which keeps the data being manifested separate from the manifest (metadata) itself whilst maintaining a link between the two.
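
The separation BagIt makes can be illustrated with a simplified Python sketch that copies files into a 'data' directory and writes the manifest alongside it. It is not a validated implementation of the BagIt specification, and an established BagIt tool would be preferable in practice. The function name `make_simple_bag` is hypothetical, and the declared version and the choice of MD5 are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_simple_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write a manifest beside it.

    Simplified, BagIt-style layout (an assumption for illustration):
        bag_dir/bagit.txt         declaration file
        bag_dir/manifest-md5.txt  one 'checksum  data/relative/path' line per file
        bag_dir/data/             the payload files themselves
    """
    bag_dir = Path(bag_dir)
    data_dir = bag_dir / "data"
    # copytree creates bag_dir and data_dir; it fails if data_dir already exists.
    shutil.copytree(source_dir, data_dir)

    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )

    lines = []
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            lines.append(f"{file_md5(path)}  {path.relative_to(bag_dir).as_posix()}")
    (bag_dir / "manifest-md5.txt").write_text(
        "".join(line + "\n" for line in lines), encoding="utf-8"
    )
    return bag_dir
```

The manifest lives outside 'data', so the payload files are left untouched, while the relative paths in each manifest line keep the link between every file and its checksum.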

Issue champion

Rachel MacGregor

Other interested parties
Any other parties who are also interested in applying Issue Solutions to their Datasets.

Possible Solution approaches

  • testing out preservation planning tools
  • suggestions of what worked/what didn't work from other developers/practitioners

Context

This is part of a wider attempt to develop a robust preservation strategy for a small but growing digital repository in a large but underfunded local authority.  The dataset is a small sample from a larger collection of digital assets, which includes a variety of file types, descriptive metadata and so on.

Lessons Learned
Notes on Lessons Learned from tackling this Issue that might be useful to inform digital preservation best practice

Datasets
Vanley Burke Archive - sample for digital asset audit

Solutions
Reference to the appropriate Solution page(s), by hyperlink.

Labels:
spruce_london, issue, integrity, duplication