One line summary | A utility based on Apache POI that analyses MS Office documents. |
Detailed description | Uses POI to walk through the OLE file structures and look for embedded objects and their properties. |
Solution champion | |
Git link | https://github.com/openplanets/AQuA/tree/master/office-analyser
Group Evaluation Notes |
|
Detailed Evaluation | How well does the solution meet your issue? * This is a good starting point for further investigation of the issues that can arise not only with MS Word 97-2003 files but also with other Office documents of the same period. Do you think you can implement the solution in your organisation? * If the solution were supported by DROID and JHOVE, it would be easy to implement in our organisation and our preservation workflows. Summarise the benefits to your organisation that the solution could provide? * If we know more about the Office files and the application and platform on which they were created, it is easier to decide on a preservation strategy. Knowing which embedded or linked objects a document contains is important for migrating it. Further testing with more documents is now needed. |
Tool (link) | http://poi.apache.org/
Issue | Identifying the content of MS Office documents |
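
The detailed description says the tool walks the OLE file structures of a document. Before any such walk, a file has to be recognised as an OLE2 compound file at all. The following is a minimal sketch of that first check using only the Java standard library. It is not the office-analyser code and it does not use POI; the class name `Ole2HeaderSketch` and its methods are hypothetical names chosen for illustration. It assumes the standard compound file header layout: an eight-byte signature at offset 0 and a little-endian sector shift at offset 30.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of office-analyser or Apache POI. */
class Ole2HeaderSketch {

    // Signature that opens every OLE2 compound file (e.g. Word 97-2003 .doc).
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns true if the given bytes start with the OLE2 signature. */
    static boolean isOle2(byte[] header) {
        return header.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    /** Returns the sector size in bytes, derived from the sector shift at offset 30. */
    static int sectorSize(byte[] header) {
        if (!isOle2(header) || header.length < 32) {
            throw new IllegalArgumentException("not an OLE2 header");
        }
        int shift = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).getShort(30);
        return 1 << shift;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java Ole2HeaderSketch <file>");
            return;
        }
        // Read at most the first 512 bytes; the header fields used here sit in the first 32.
        byte[] buffer = new byte[512];
        int read = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            while (read < buffer.length) {
                int r = in.read(buffer, read, buffer.length - read);
                if (r < 0) {
                    break;
                }
                read += r;
            }
        }
        byte[] header = Arrays.copyOf(buffer, read);
        if (isOle2(header) && header.length >= 32) {
            System.out.println("OLE2 compound file, sector size " + sectorSize(header));
        } else {
            System.out.println("Not an OLE2 compound file");
        }
    }
}
```

A file that passes this check could then be handed to a library such as POI to enumerate its storages and streams. A file that fails it, for example a ZIP-based .docx, needs a different analysis path.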