Parsing PST and OST files using Apache Tika
The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries. This solution uses the Tika toolkit: http://tika.apache.org/
The solution took Tika's MBOX parser and modified it to handle PST files. The following fields from the email messages were used in building the parser:
- Has Attachment [true/false]
- Number of attachments
- Date Received
The parser then creates SAX events for the parsed data.
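As a rough illustration of the SAX step, the sketch below emits the three fields listed above as SAX events to an org.xml.sax.ContentHandler, which is the interface Tika parsers write their output to. It uses only the JDK, not Tika itself, and it is not the solution's actual code. The class name MessageSaxSketch, the element names (message, meta, p), the field names (hasAttachment, attachmentCount, dateReceived) and the sample values are all assumptions made for the example.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

public class MessageSaxSketch {

    // Emits one empty <meta name="..." content="..."/> element for a field.
    static void emitMeta(ContentHandler handler, String name, String value)
            throws SAXException {
        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "name", "name", "CDATA", name);
        attrs.addAttribute("", "content", "content", "CDATA", value);
        handler.startElement("", "meta", "meta", attrs);
        handler.endElement("", "meta", "meta");
    }

    // Emits the SAX events for one message: the three fields, then the body text.
    // Element and field names here are hypothetical, chosen for the example.
    public static void emit(boolean hasAttachment, int attachmentCount,
                            String dateReceived, String body,
                            ContentHandler handler) throws SAXException {
        handler.startDocument();
        handler.startElement("", "message", "message", new AttributesImpl());
        emitMeta(handler, "hasAttachment", Boolean.toString(hasAttachment));
        emitMeta(handler, "attachmentCount", Integer.toString(attachmentCount));
        emitMeta(handler, "dateReceived", dateReceived);
        handler.startElement("", "p", "p", new AttributesImpl());
        char[] chars = body.toCharArray();
        handler.characters(chars, 0, chars.length);
        handler.endElement("", "p", "p");
        handler.endElement("", "message", "message");
        handler.endDocument();
    }

    // Convenience wrapper: serializes the emitted events to an XML string
    // using the JDK's TransformerHandler as the ContentHandler.
    public static String toXml(boolean hasAttachment, int attachmentCount,
                               String dateReceived, String body) throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));
        emit(hasAttachment, attachmentCount, dateReceived, body, handler);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample values only; a real parser would read these from the PST message.
        System.out.println(toXml(true, 2, "2013-03-19T10:15:00Z", "Hello from a PST message"));
    }
}
```

Because emit writes to any ContentHandler, the same events could be sent to a text extractor or an indexer instead of the XML serializer used in toXml.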
Parsing PST and OST email files for textual mining and searching