Title |
PDF to PDF-A conversion |
Detailed description | Converting PDF files to PDF/A is one of our most time-consuming tasks. It is frustrating and not always successful, and it cannot be run as a batch process. The conversion often fails (the most common cause is missing fonts), and some PDF files cannot be converted at all. Is there a way to make this task easier? One thing that might help is a report on a batch of PDFs that highlights those doomed to fail before we waste any time on them. |
Issue champion | |
Other interested parties |
Leeds / White Rose would be interested in a solution for our repositories (we would use it on eTheses first, then on the research repository) |
Possible Solution approaches |
|
Context | We now receive many files as PDF (particularly grey literature library files). These come in a variety of versions. Some are secured, some are password protected, some have errors, some have missing fonts, and some seem fine but still won't convert to PDF/A. We have about 200 of these to convert to PDF/A every month, and it is one of our least favourite tasks! |
Lessons Learned | Notes on Lessons Learned from tackling this Issue that might be useful to inform digital preservation best practice |
Datasets | ADS Grey Literature Library, eTheses |
Solutions | PDF to PDF-A Conversion Pre-Processor |
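
The batch report suggested in the description could be sketched roughly as follows. This is a minimal illustration rather than the pre-processor listed under Solutions: it uses only the Python standard library and scans the raw bytes of each file for a few telltale markers (a `%PDF-` header, a trailing `%%EOF`, an `/Encrypt` entry, and font dictionaries with no `/FontFile` entry anywhere). The function names `triage_pdf_bytes` and `report` are hypothetical, and the checks are heuristics under stated assumptions: they can miss problems hidden inside compressed object streams, and they do not catch files where only some fonts are embedded.

```python
import re
import sys
from pathlib import Path

HEADER_RE = re.compile(rb"%PDF-\d\.\d")
# Matches "/Type /Font" or "/Type/Font" but not "/Type /FontDescriptor".
FONT_RE = re.compile(rb"/Type\s*/Font\b")


def triage_pdf_bytes(data: bytes) -> list[str]:
    """Return warnings for one file; an empty list means no obvious blocker was seen."""
    problems = []
    if HEADER_RE.search(data[:1024]) is None:
        problems.append("no %PDF header (not a PDF, or badly damaged)")
    if b"%%EOF" not in data[-2048:]:
        problems.append("no %%EOF marker near the end (truncated or corrupt)")
    if b"/Encrypt" in data:
        problems.append("encrypted (secured or password protected)")
    # Assumption: visible font dictionaries with no /FontFile, /FontFile2 or
    # /FontFile3 entry anywhere suggest that no font program is embedded.
    if FONT_RE.search(data) and b"/FontFile" not in data:
        problems.append("fonts declared but none appear to be embedded")
    return problems


def report(folder: Path) -> dict[str, list[str]]:
    """Map each *.pdf file name in the folder to its list of warnings."""
    return {p.name: triage_pdf_bytes(p.read_bytes())
            for p in sorted(folder.glob("*.pdf"))}


if __name__ == "__main__":
    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for name, problems in report(target).items():
        status = "; ".join(problems) if problems else "no obvious blockers"
        print(f"{name}: {status}")
```

Run against a folder of incoming files, this prints one line per PDF so that the likely failures can be set aside before any conversion is attempted. A file that passes these checks may still fail to convert; a dedicated PDF/A validator would be needed for a definitive answer.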