Page: Apache POI Office Document Analyser (AQuA)
One line summary: A utility based on Apache POI that is able to analyse MS Office documents. Detailed description: Uses POI to walk through the OLE file structures and look for embedded objects and their properties. Solution champion: anjackson Git link ...
Other labels: apache, poi, ms-office, office, ms, ole, word, excel
Page: Detect, extract and analyse embedded objects in PDFs (AQuA)
One line summary: Detect and identify embedded objects in PDFs, then where appropriate extract and analyse them further. Detailed description: The PDF specification is complex, and PDF files can contain other objects, embedded at the file or page level ...
Other labels: pdf, objects, bmp, jpg, png, gif, tiff, pdfbox
Page: Extracting embedded objects from Office OpenXML documents (Practical Preservation Issues)
Title: Extracting embedded objects from Office OpenXML documents Detailed description: Overview: docXtractor is a Python script using zipfile and lxml hooks to extract media from OOXML files (specifically docx in the current alpha implementation). docXtractor parses ...
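The zipfile-based approach described for docXtractor can be illustrated with a minimal sketch. This is not docXtractor's own code: the function name `extract_media` and its parameters are hypothetical, it uses only the standard-library `zipfile` module (no lxml), and it assumes the media sits under `word/media/` inside the package, which is the usual location in a docx file.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_media(ooxml_path, out_dir, media_prefix="word/media/"):
    """Copy every entry under media_prefix out of an OOXML package.

    Hypothetical helper for illustration; "word/media/" is an assumption
    about where a docx package keeps its embedded media.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # An OOXML document is a ZIP container, so zipfile can open it directly.
    with zipfile.ZipFile(ooxml_path) as package:
        for info in package.infolist():
            if info.is_dir() or not info.filename.startswith(media_prefix):
                continue
            # Keep only the base name so an entry cannot escape out_dir.
            target = out_dir / PurePosixPath(info.filename).name
            target.write_bytes(package.read(info))
            written.append(target)
    return written
```

Calling `extract_media("report.docx", "media_out")` would write each media entry into `media_out` and return the list of paths written. Other OOXML types keep media under a different folder, so `media_prefix` would need adjusting for them.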
Page: Preserving MS Outlook (.msg) E-mails with Attachments - Solution (SPRUCE)
Title: Preserving MS Outlook (.msg) Emails with Attachments Detailed description: The solution is a JAR executable which makes use of the msgparser Java library to extract binary attachments from Microsoft Outlook MSG files. Extracted ...
Other labels: msg, attachment, extractor, java, batch, windows, microsoft, spruce