*Title*

Document content and utility preservation


*Detailed description*

We need to ensure that documents are readable and remain so for as long as possible. Any method for assessing them needs to be quick and accurate. To this end, we need to identify vulnerable documents so that we can allocate resources to making them durable, and discover which techniques and tools are needed to accomplish this. We also need to measure the scale of the problem, both to know what resources are required and to build a case for more where necessary.

*Issue champion*

[~aran]


*Other interested parties*
_Any other parties who are also interested in applying Issue Solutions to their Datasets._

*Possible Solution approaches*
* C3PO
* DROID
* eprints digital preservation plugin incorporating DROID
* PDFBox
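
As a rough illustration of how the problem might be measured with one of the tools above, the sketch below counts files per format identifier in a CSV export from DROID. It assumes the export has `TYPE` and `PUID` columns and that folders appear as rows with `TYPE` set to `Folder`; check these names against the actual export before relying on them. The function name `count_formats` and the `SAMPLE` data are hypothetical and only serve the example.

```python
import csv
import io
from collections import Counter
from typing import TextIO

# Hypothetical sample in the assumed DROID CSV layout (not real repository data).
SAMPLE = '''"NAME","TYPE","PUID"
"docs","Folder",""
"a.pdf","File","fmt/18"
"b.pdf","File","fmt/18"
"c.doc","File","fmt/40"
"d.bin","File",""
'''


def count_formats(csv_file: TextIO, column: str = "PUID") -> Counter:
    """Count rows per value of `column` in a DROID-style CSV export."""
    counts: Counter = Counter()
    for row in csv.DictReader(csv_file):
        # Assumption: folder rows carry no format identification, so skip them.
        if row.get("TYPE") == "Folder":
            continue
        # Rows with an empty identifier are grouped as "unidentified".
        counts[row.get(column) or "unidentified"] += 1
    return counts


if __name__ == "__main__":
    for fmt, n in count_formats(io.StringIO(SAMPLE)).most_common():
        print(fmt, n)
```

Files that end up under `unidentified`, or under formats known to be at risk, would be the first candidates for closer inspection.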

*Context*


Eprints Research Repository at Middlesex University.

*Lessons Learned*

PDFBox Preflight generates error reports on PDF files, but the significance of the reported errors still needs to be investigated. The eprints bazaar offers a digital preservation plugin that runs with DROID (also available in the bazaar); I am currently testing it.
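
One possible way to start judging which errors matter is to tally how often each error code occurs across many reports, so that the most common codes can be investigated first. The sketch below assumes each error appears on its own line as a dotted numeric code followed by a colon and a description; that layout is an assumption about the report text, and the pattern would need adjusting to the real output. The name `tally_errors` and the sample lines are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed line layout: "<dotted numeric code> : <description>".
ERROR_LINE = re.compile(r"^\s*(\d+(?:\.\d+)*)\s*:\s*(.+)$")


def tally_errors(report_lines: Iterable[str]) -> Counter:
    """Count occurrences of each error code in the given report lines."""
    counts: Counter = Counter()
    for line in report_lines:
        match = ERROR_LINE.match(line)
        if match:
            # Only the code is tallied; the description is ignored here.
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical report lines, invented for illustration only.
    sample = [
        "Report for example.pdf",
        "1.0 : example error A",
        "2.1.3 : example error B",
        "1.0 : example error A",
    ]
    for code, n in tally_errors(sample).most_common():
        print(code, n)
```

Running this over the reports for the whole dataset would show which error codes dominate, which could help decide where investigation effort is best spent.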


*Datasets*

[SPR:Middlesex University eprints repository full text documents]


*Solutions*
_Reference to the appropriate Solution page(s), by hyperlink._