Label: embedded_objects+issue+york_hackathon

All content with label embedded_objects+issue+york_hackathon.
Related Labels: word, opf_montpellier, jpg, audiovisual, qa, ocr, msg, audio, gif, binary, video, corruption, integrity, obsolescence, comparison, bmp, webarchive, apache, rights

Page: Extracting embedded objects from docx files (Practical Preservation Issues)
Title: Extracting embedded objects from docx files. Detailed description: We preserve MS Word documents as docx files. We are reasonably confident that the XML structure preserves the report text and structure well. We are not so confident about ...
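A minimal sketch of one way to approach the extraction, not taken from the issue page itself. It relies on the fact that a docx file is a ZIP package, and it assumes that embedded objects sit under `word/embeddings/`, which is where Word conventionally stores them; a package is free to place them elsewhere, so a robust tool would follow the package relationships instead. The function names `list_embedded_objects` and `extract_embedded_objects` are hypothetical.

```python
import io
import zipfile

# Assumption: Word conventionally stores embedded objects under this folder.
# The package format does not require this path, so treat it as a heuristic.
EMBEDDINGS_PREFIX = "word/embeddings/"


def list_embedded_objects(docx_source):
    """Return the names of package members stored under word/embeddings/.

    docx_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_source) as package:
        return [
            name
            for name in package.namelist()
            if name.startswith(EMBEDDINGS_PREFIX) and not name.endswith("/")
        ]


def extract_embedded_objects(docx_source):
    """Return a dict mapping each embedded member name to its raw bytes."""
    with zipfile.ZipFile(docx_source) as package:
        return {
            name: package.read(name)
            for name in package.namelist()
            if name.startswith(EMBEDDINGS_PREFIX) and not name.endswith("/")
        }
```

Calling `list_embedded_objects("report.docx")` on a hypothetical file would return the member names; the raw bytes from `extract_embedded_objects` can then be passed to a format identification tool, since embedded objects are often stored as opaque binary containers.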
Page: Web based email "harvesting" (Practical Preservation Issues)
Title: Web based email "harvesting". Detailed description: The setting is the collection of private archives, more specifically web-based email. It should be possible to automatically harvest emails from web-based email accounts. The system should scale as the number ...
Other labels: email, harvesting, data_capture