
h3. After the event

Wiki pages capturing datasets, issues, and solutions: [http://wiki.opf-labs.org/label/REQ/preservingpdf]

Survey for attendee feedback: [https://www.surveymonkey.com/s/preservingpdf]

h3. Registration

Last chance to sign up\! Registration closes on *Friday 22 August*: [https://www.eventbrite.co.uk/e/preserving-pdf-identify-validate-repair-registration-12203790867]

[OPF members|http://openplanetsfoundation.org/members] are invited free-of-charge (please use the code issued to your main point of contact at your organisation). Non-members are welcome at the rate of EUR 150.

h3. Date & Location

1-2 September 2014

German National Library of Economics, Hamburg

h3. Overview

This event will focus on the PDF file format. Participants are encouraged to contribute requirements, for instance sample files with errors or anomalies for investigation. Currently available identification and validation tools will be demonstrated, with the opportunity to compare results using your own collections and identify gaps for future development.
 
OPF members have identified specific tasks for the event:
* {color:#222222}check the validity of the files and whether they are encrypted;{color}
* {color:#222222}perform quality assurance checks after migration, using comparison tools; {color}
* {color:#222222}investigate error messages, repair the problems, and build a knowledge base; and{color}
* {color:#222222}document and improve open source tool functionality e.g. JHOVE validation.{color}

There will also be discussion sessions, and the opportunity to share experiences with peer organisations.



Olaf Drümmer, Chairman of the PDF Association, CEO of callas software GmbH, and DIN delegate to all PDF-related working groups in ISO TC 171 and ISO TC 130 since 1999, will present the work of the ISO standards body, including efforts related to PDF and PDF/A, and share the industry perspective on tool development.

h3. Why attend?

* Learn about PDF and PDF/A standards 
* Document and prioritise known preservation problems with PDF files
* Assess state-of-the-art identification and validation tools
* Test the tools on sample files and compare the results 
* Define organisational requirements and policies for conformance
* Identify requirements for future development work (road-mapping)
* Help improve current PDF tools (hacking)

h3. Who should attend? 

* Collection owners with a responsibility to preserve PDFs. Bring along your problem files\! 
* Developers interested in hacking PDF identification and validation tools.

h3. Discussion topics

* PDF and PDF/A validation
* PDF standards and specifications
* PDF technical metadata
* Role of PDF/A as a preservation format


h3. Agenda


h4. 1 September



|| || Sessions || Parallel Session || Facilitators ||
| {color:#222222}09:30 - 10:00{color} | {color:#222222}Coffee and registration{color} | | Rebecca McGuinness, OPF |
| 10:00 - 10:15 | {color:#222222}{*}Welcome & Housekeeping{*}{color} | Plenary | Ed Fay, OPF & Yvonne Friese, ZBW |
| 10:15 - 10:45 | {color:#222222}{*}Lightning talks: who you are, the story behind your sample data, and what you want from the event.{*}{color} | Plenary | Rebecca McGuinness, OPF |
| 10:45 - 11:45 | {color:#222222}{*}Talk: From PDF to PDF/A -\- an overview of applicable specifications and ISO standards.{*}{color}\\ {color:#222222}{*}Which PDF versions are of interest today? What value does the family of PDF/A standards add to the PDF ecosystem?{*}{color} | Plenary | {color:#222222}Olaf Drümmer, Chairman of the PDF Association{color} |
| 11:45 - 12:00 | {color:#222222}{_}Break{_}{color}\\ | | |
| 12:00 - 12:30 | {color:#222222}{*}Collaborating: Tools, test data and best practices for working together{*}{color}\\ {color:#222222}An overview of the event format and the approach we'll be taking to requirements gathering and open source development. This will also include an introduction to the tools and test data we'll be using.{color} | Plenary | Carl Wilson, OPF |
| | | | |
| 12:30 - 13:30 | _Lunch_ | \\ | \\ |
| 13:30 - 15:00 | {color:#222222}Installing a pre-packaged virtual machine with open-source digital preservation tools{color} | Plenary | OPF |
| 15:00 - 15:45 | {color:#222222}{*}Identify{*}{color}\\
* {color:#222222}A quick technical background to PDF format identification{color}
* {color:#222222}An evidence-based tool demonstration of file,{color} {color:#222222}[DROID|http://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/]{color}{color:#222222}, and{color} {color:#222222}[Apache Tika|http://tika.apache.org/]{color} | Plenary | Carl Wilson, OPF |
| 15:45 - 16:00 | _Coffee_ | | |
| 16:00 - 18:00 | {color:#222222}{*}Prioritising tasks{*}{color}{color:#222222}:{color}\\
As a group, we'll prioritise the tasks listed in the Overview, and any additional tasks discussed in the lightning talks. We'll split into teams and begin by brainstorming requirements and approaches to address these challenges.\\
{color:#222222}{*}Write up{*}{color}{color:#222222}: Develop plans and finalise achievable goals based on feedback from the group{color}\\
{color:#222222}{*}Presentations{*}{color}{color:#222222}: Teams present requirements and plan to the group for feedback{color} | Hacking | All |
| | _Close_ | | |
| 20:00 | Optional self-paid group dinner at Brauhaus Joh. Albrecht\\ {color:#222222}A map to the restaurant is here:{color} [http://hamburg.brauhaus-joh-albrecht.de/index.php/anfahrtsskizze] | | |

h4. {color:#222222}2 September{color}

|| || Sessions || Parallel Sessions || Facilitators ||
| {color:#222222}09:00 - 09:15{color} | Coffee | | |
| {color:#222222}09:15 - 09:30{color} | Welcome back | | Ed Fay, OPF |
| 09:30 - 10:45 | {color:#222222}{*}Talk: What does it take to build a comprehensive PDF/A validator?{*}{color} {color:#222222}{*}Will any two validators ever agree?{*}{color}\\ {color:#222222}{*}And why is there no readily available reference implementation?{*}{color}\\
\\ {color:#222222}{*}Discussion{*}{color}{color:#222222}: How to validate and who validates the validators.{color} | Hacking | {color:#222222}Olaf Drümmer, Chairman of the PDF Association{color} |
| 10:45 - 11:00 | {color:#222222}{_}Break{_}{color}\\ {color:#222222}_(sign up for optional sessions)_{color} | | |
| 11:00 - 12:00 | Write up group work on the wiki and prepare development tasks | Plenary | |
| 12:00 - 13:00 | {color:#222222}{*}Hacking and Optional sessions{*}{color}{color:#222222}:{color}\\
(1) Policy workflow talk from our hosts, Goportis (Michelle/Yvonne) \\ {color:#222222}Our hosts will explain their organisation's background and the motivation behind their policy decision-making.{color} {color:#222222}They will also give practical examples of their current workflows with PDF files.{color}\\
\\
(2) Validation demonstrations using virtual environments (Carl) \\
* {color:#222222}Automating characterisation, PDF/A validation and policy conformance checking.{color}
* {color:#222222}An evidence-based demonstration of JHOVE, Apache Tika, and Apache PDFBox/PreFlight.{color} | Hacking | (1) Michelle Lindlar / Yvonne Friese, Goportis \\ (2) Carl Wilson, OPF |
| 13:00 - 14:00 | {color:#222222}{_}Lunch{_}{color}\\ | | |
| 14:00 - 16:00 | *Hacking and Optional Sessions* \\
Continue to work on tools; the optional sessions will run again if there is demand. \\ {color:#222222}Finish requirement write-ups and progress on the wiki in teams.{color}\\ {color:#222222}Prepare presentations/demos to report back to the group.{color} | {color:#222222}Final hacking, check in code{color} {color:#222222}and prepare to report back to the group.{color} | All |
| 16:00 - 17:00 | {color:#222222}{*}Presentations & Coffee{*}{color}\\ {color:#222222}Each team will report back to the group on their progress and show demonstrations as appropriate.{color}\\
\\ {color:#222222}{*}Road-mapping{*}{color}\\
\\ {color:#222222}As a group we'll discuss what has worked well, and what hasn't worked and why.{color}\\ {color:#222222}We'll discuss how both the requirements and the development actions arising from the event can be taken forward by OPF and its members.{color}\\
\\ {color:#222222}Wrap up and next steps{color} | Plenary session | All |
| 17:00 | _Close_ | | |