Detecting Encryption/DRM in Digital Content
Detailed description Many file formats make provision for encrypting content, e.g. password-protected PDFs. Independently of any format, software exists that encrypts data at the file, directory, or device level, e.g. encrypted hard drives. Encrypted content is not suitable for long-term preservation because the content is inaccessible. No single tool will solve this issue, because many forms of encryption are format-specific; a collection of tools will be needed.
Digital Rights Management (DRM) is closely related to encryption and is usually used by content producers to ensure that content is accessible only to legitimate (paying) users.
Scalability Challenge
The large variety of encryption techniques employed across many formats makes this a complex issue.
Issue champion Maureen Pennock (BL)
Other interested parties
Potentially many; it is a generic and common issue.
Possible Solution approaches PDFBox for PDF encryption.
Apache POI for Office documents, or talk to Microsoft Research.
Java zip library for container formats (e.g. zip).
Calibre to detect DRM in e-book formats.
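As an illustration of the container-format approach, the following is a minimal sketch rather than a proposed solution. It uses Python's standard `zipfile` module in place of the Java zip library named above, and checks bit 0 of each entry's general purpose bit flag, which the ZIP format sets for encrypted entries. The function name `encrypted_entries` is hypothetical, and the sketch assumes the archive's central directory is itself readable; archives that also encrypt the central directory would need separate handling.

```python
import zipfile

# Bit 0 of the ZIP general purpose bit flag marks an entry as encrypted.
ENCRYPTED_FLAG = 0x1


def encrypted_entries(source):
    """Return the names of entries whose 'encrypted' flag bit is set.

    `source` may be a file path or a binary file-like object.
    Only the central directory is read; no entry is decompressed.
    """
    with zipfile.ZipFile(source) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if info.flag_bits & ENCRYPTED_FLAG
        ]
```

Calling `encrypted_entries("sample.zip")` (a hypothetical path) returns an empty list when no entry is flagged as encrypted. Because many formats are zip-based containers, the same check might serve as a first pass for them, but that is an assumption to verify per format.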
Context Details of the institutional context to the Issue. (May be expanded at a later date)
Lessons Learned Notes on Lessons Learned from tackling this Issue that might be useful to inform the development of Future Additional Best Practices, Task 8 (SCAPE TU.WP.1 Dissemination and Promotion of Best Practices)
Training Needs  
Datasets A dataset of sample encrypted content in different formats needs to be built.
Solutions Reference to the appropriate Solution page(s), by hyperlink


Objectives Which SCAPE objectives do this issue and a future solution relate to? e.g. scalability, robustness, reliability, coverage, precision, automation
Success criteria Describe the success criteria for solving this issue - what are you able to do? - what does the world look like?
Automatic measures What automated measures would you like the solution to provide in order to evaluate it for this specific issue? Which measures are important?
If possible specify very specific measures and your goal - e.g.
 * process 50 documents per second
 * handle 80 GB files without crashing
 * identify 99.5% of the content correctly
Manual assessment Apart from the automated measures you would like to obtain, do you foresee any manual assessment being necessary to evaluate the solution of this issue?
If possible specify measures and your goal - e.g.
 * Solution installable with basic Linux system administration skills
 * User interface understandable by non-developer curators
Actual evaluations Links to actual evaluations of this Issue/Scenario

Labels issue, obsolescence, characterisation, lsdr