Summary
Purpose | Detects and extracts metadata and text content from documents. |
Homepage | http://tika.apache.org/ |
Source Code Repository | https://github.com/apache/tika |
License | Apache License, Version 2.0 |
Debian Package | |
Description
Java-based tool for detecting and extracting metadata and text content from documents.
User Experiences
- IS25 Web Content Characterisation
- SO11 The Tika characterisation Tool
- SO17 Web Archive Mime-Type detection workflow based on Droid and Apache Tika
Searching for Tika on OPF Labs
Found 9 search result(s) for Tika.
Page:
EVAL ARC2WARC-TOMAR with Tika
(SCAPE)
Evaluator(s): Sven Schlarb. Evaluation points: Assessment of measurable points. Metric ...
Mar 10, 2014
Page:
SO11 The Tika characterisation Tool
(SCAPE)
... versioning, so further work is needed to add versions to other mimetypes. Can Tika be extended to support regexp like Fido? Prototype of Tika (https://github.com/openplanets/tika) up and running that used regular expressions ...
Jun 14, 2012
Page:
Tika Batch File Identification
(SPRUCE)
... investigation and feedback to Apache Tika. Some files are only identified as application/octet-stream (Tika default). Needs further investigation and feedback to Tika. Some problems with character encoding of metadata returned by Tika causing issues when trying to load JSON output ...
Jun 13, 2012
Labels: spruce, spruce_glasgow, identification, solution
Page:
Example - Working with Apache Tika
(SCAPE)
... button) # Cloned the fork locally ($ git clone --recursive https://github.com/openplanets/tika) # cd into your local repository ($ cd tika) # Link with the upstream repository ($ git remote add upstream git://github.com/apache ...
Jan 24, 2012
Page:
Extracting and aggregating metadata with Apache Tika
(SPRUCE)
At the Glasgow Mashup Peter May created a Python wrapper for Apache Tika. Carl Wilson extended this work, creating a Java utility class that wrapped Tika ...
Sep 28, 2012
Labels: spruce_london, solution, characterisation
Page:
Parsing PST OST file using TIKA
(Knowledge Base)
Detailed description: The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents ...
Jun 05, 2013
Labels: chapel_hill, solution, appraisal_assessment, characterisation
Page:
Large scale document characterization and identification with Tika and DROID on SCAPE Azure platform
(SCAPE)
... tools, user should SCAPE Azure platform. We measured the speed of the Apache Tika Content Analysis Toolkit and the DROID File Format Identification Tool when they were ...
Jul 15, 2014
Page:
SO17 Web Archive Mime-Type detection workflow based on Droid and Apache Tika
(SCAPE)
... Involved tools: unARC: a tool (by SB) to unpack ARC files. TIFOWA (using the Tika API): TIFOWA (by ONB) uses the Tika API to extract metadata from the files contained in a folder structure ...
Mar 01, 2012
Labels: identification, solution
Page:
PC.WP1 Tool tracker
(SCAPE)
Tika package by BL; UNIX file package by ?; FITS package by ?; ffprobe package by SB ...
Nov 14, 2012