Content with label pdf in AQuA
Related Labels:
mj2, validation, characterisation, embedded_objects, jpg, jpxfilter, jp2k, png, document, jpeg2000, qa, jpm, xml, extraction, ocr, dependency, mets, java, acroform,
aqua, gif, alto, itext, office, embedded, solution, comparison, obsolescence, zip, issue, scape, taverna, bmp, metadata, structural_relationships, jpod, characterise, objects, tiff, fonts, jp2, jpx, api, dataset, pdfbox
Page:
Born-digital - migration success
One line summary Checking whether an automated normalisation produces a surrogate of sufficient quality ... Detailed description "sufficient" obviously needs to be defined in terms of significant properties relevant to the context but are there some checks which can be run to determine whether ...
Other labels:
qa, comparison, characterise, office, issue
|
Page:
Check consistency between metadata and content
One line summary Check that the METS, OCR, JPEG2000 masters and the PDFs are consistent. Detailed description As shown in the diagram below, check the image and ALTO file information defined in METS against the real files stored in separate Zip files. Also ...
Other labels:
mets, ocr, metadata, jpeg2000, jp2k, jp2, jpx, mj2
|
Page:
Detect, extract and analyse embedded objects in PDFs
One line summary Detect and identify embedded objects in PDFs, then where appropriate extract and analyse further. Detailed description The PDF specification is complex, and PDF files can contain other objects, embedded at the file or page level ...
Other labels:
objects, bmp, jpg, png, gif, tiff, pdfbox, jpxfilter
|
Page:
Embedded links within the PDF
One line summary Need to identify links embedded within PDFs and check whether they are still live ...
Other labels:
issue, obsolescence, dependency
|
Page:
Embedded objects in PDFs
One line summary Need to detect embedded objects within PDFs ...
Other labels:
issue, embedded_objects
|
Page:
Open Access PDFs
Basic description Open Access research outputs and etheses (sample being used from White Rose Research Online and White Rose eTheses Online) ...
Other labels:
aqua, dataset, document
|
Page:
PDF Characterisation Tool
One line summary Java program to characterise PDF files, looking for preservation concerns. ...
Other labels:
characterise, pdfbox, api, fonts, issue, acroform, embedded, jpeg2000
|