Fonts missing, damaged or incomplete

Current version by Johan van der Knijff on Jul 11, 2014 14:59.

Based on a number of tests, non-embedded fonts usually appear to return error code 3.1.3, although the description of that error indicates that it may include other font issues as well. Also, the results of this [Analysis of Acrobat Engineering PDFs with Acrobat Preflight and Apache Preflight] indicate that in some cases non-embedded fonts may produce other error codes. The mapping between font problems and error codes is therefore unclear and may need further investigation.

h2. Recommendations

h3. Pre-ingest

* Formulate a policy on how to deal with non-embedded, damaged or incomplete fonts.
* Use [Apache Preflight|Apache PDFBox] to check for font errors. Depending on the provenance of the PDFs this may result in many font errors being reported. As the meaning of Preflight's font error codes is not 100% clear, this may not be a viable solution (yet) in operational workflows.
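
Because Preflight may report many font errors per file, it can help to tally the font-related error codes before deciding on a policy. The following is a minimal sketch of such a tally, not part of Preflight itself. It assumes that each error appears on its own line as "<code> : <description>" and that font-related codes are those in category 3 (such as 3.1.3). Both assumptions should be checked against the output of the Preflight version in use. The function name {{font_error_counts}} and the sample report text are made up for illustration.

```python
import re
from collections import Counter

# Assumption: each reported error sits on its own line as
# "<code> : <description>". Adjust the pattern if your Preflight
# version formats its report differently.
ERROR_LINE = re.compile(r"^\s*(\d+(?:\.\d+)*)\s*:\s*(.*)$")

# Assumption: font-related error codes are the ones in category 3.
FONT_CATEGORY = "3"


def font_error_counts(report_text):
    """Count font-related error codes in the text of one Preflight report."""
    counts = Counter()
    for line in report_text.splitlines():
        match = ERROR_LINE.match(line)
        if match is None:
            continue  # header and blank lines carry no error code
        code = match.group(1)
        if code.split(".")[0] == FONT_CATEGORY:
            counts[code] += 1
    return counts


# Hypothetical report text, invented for illustration only.
sample_report = """\
report header line for example.pdf
3.1.3 : placeholder description of a font problem
3.1.3 : placeholder description of a font problem
2.4.3 : placeholder description of a non-font problem
3.1.1 : placeholder description of another font problem
"""

if __name__ == "__main__":
    for code, count in sorted(font_error_counts(sample_report).items()):
        print(code, count)
```

Summing the counters over all files in a batch shows which font error codes dominate, which may help narrow down the codes that need further investigation.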

h3. Existing collections

* Use [Apache Preflight|Apache PDFBox] to check for font errors. However, this may not be a practical solution yet, because the meaning of Preflight's font error codes is not fully clear.

h2. Example files
* [http://www.opf-labs.org/format-corpus/pdfCabinetOfHorrors/] - PDF Cabinet of Horrors on OPF Format Corpus