
| *Title* \\ | Detecting Encryption/DRM in Digital Content \\ |
| *Detailed description* | Many file formats make provision for encrypting content, e.g. password-protected PDFs. Beyond individual formats, software exists that encrypts data at the file, directory, and device level, e.g. encrypted hard drives. Encrypted content is not suitable for long-term preservation because the content is inaccessible without the key or password. This issue will not be solved by a single tool: many forms of encryption are format-specific, so a collection of tools will be needed. \\
Digital Rights Management (DRM) is closely related to encryption and is usually used by content producers trying to ensure that content is only accessible to legitimate (paying) users. \\ |
| *Scalability Challenge* \\ | The large variety of encryption techniques employed across many formats makes this a complex issue. \\ |
| *[Issue champion|SP:Responsibilities of the roles described on these pages]* | [Maureen Pennock|] (BL) |
| *Other interested parties* \\ | Potentially many; it is a generic and common issue. |
| *Possible Solution approaches* | Apache PDFBox for PDF encryption. \\
Apache POI for Office documents, or talk to Microsoft Research. \\
Java zip library for container formats (e.g. ZIP). \\
Calibre to detect DRM in e-book formats. \\ |
| *Context* | _Details of the institutional context to the Issue. (May be expanded at a later date)_ \\ |
| *Lessons Learned* | _Notes on Lessons Learned from tackling this Issue that might be useful to inform the development of Future Additional Best Practices, Task 8 (SCAPE TU.WP.1 Dissemination and Promotion of Best Practices)_ \\ |
| *Training Needs* | |
| *Datasets* | A requirement to build a dataset of sample encrypted content in different formats. |
| *Solutions* | _Reference to the appropriate Solution page(s), by hyperlink_ |
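As an illustration of the container-format approach listed under Possible Solution approaches, the following is a minimal sketch only. It assumes Python's standard {{zipfile}} module as a stand-in for the Java zip library named above, and the function name {{encrypted_zip_entries}} is hypothetical. It relies on bit 0 of each entry's general purpose bit flag, which the ZIP format sets when an entry is encrypted. It only inspects the flags and does not attempt to decrypt anything.

```python
import zipfile
from typing import BinaryIO, List, Union

# Bit 0 of the ZIP general purpose bit flag marks an entry as encrypted.
ENCRYPTED_FLAG = 0x1


def encrypted_zip_entries(source: Union[str, BinaryIO]) -> List[str]:
    """Return the names of entries flagged as encrypted in a ZIP container.

    `source` may be a file path or a seekable binary file object.
    Hypothetical helper for illustration; it reads the central
    directory only and never tries to open or decrypt entry data.
    """
    with zipfile.ZipFile(source) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if info.flag_bits & ENCRYPTED_FLAG
        ]
```

An empty list means no entry in the container is flagged as encrypted. A check like this would be one member of the collection of format-specific tools described above; it says nothing about encryption applied inside the contained files themselves, or about a container that is itself stored on an encrypted device.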

h1. Evaluation

| *Objectives* | _Which SCAPE objectives do this issue and a future solution relate to? e.g. scalability, robustness, reliability, coverage, preciseness, automation_ |
| *Success criteria* | _Describe the success criteria for solving this issue - what are you able to do? - what does the world look like?_ |
| *Automatic measures* | _What automated measures would you like the solution to give to evaluate the solution for this specific issue? which measures are important?_ \\
_If possible specify very specific measures and your goal - e.g._ \\
_ \* process 50 documents per second_ \\
_ \* handle 80 GB files without crashing_ \\
_ \* identify 99.5% of the content correctly_ \\ |
| *Manual assessment* | _Apart from the automated measures you would like to get, do you foresee any manual assessment being necessary to evaluate the solution of this issue?_ \\
_If possible specify measures and your goal - e.g._ \\
_ \* Solution installable with basic Linux system administration skills_ \\
_ \* User interface understandable by non-developer curators_ \\ |
| *Actual evaluations* | Links to actual evaluations of this Issue/Scenario |