Extraction of keywords (and images) from large collections of text based files

Title
Extraction of keywords (and images) from large collections of text based files
Detailed description To facilitate rapid initial categorisation of large heterogeneous collections of primarily text-based digital files/documents, a tool is needed that parses the documents and presents a summary of (e.g.) the top 5 keywords (by word count) from each document, along with thumbnails of any images embedded in the document.
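The keyword part of this request can be illustrated with a minimal sketch. It is not the solution from the mashup event and does not use Lucene. It assumes the documents have already been converted to plain-text (.txt) files, and the names top_keywords, summarise_collection and STOP_WORDS are hypothetical, chosen only for this example. The stop-word list is deliberately tiny, and image thumbnail extraction is not covered.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: a very small illustrative stop-word list; a real tool
# would need a fuller, language-appropriate list.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "for",
    "on", "with", "that", "this", "as", "by", "be", "are", "was", "at",
    "from",
}


def top_keywords(text, n=5):
    """Return the n most frequent words in text as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    # Drop stop words and very short tokens before counting.
    counts = Counter(
        w for w in words if w not in STOP_WORDS and len(w) > 2
    )
    return counts.most_common(n)


def summarise_collection(root, n=5):
    """Map each .txt file under root to its top n keywords."""
    summary = {}
    for path in sorted(Path(root).rglob("*.txt")):
        # Assumption: files are UTF-8; undecodable bytes are replaced.
        text = path.read_text(encoding="utf-8", errors="replace")
        summary[str(path)] = top_keywords(text, n)
    return summary
```

Calling summarise_collection on a directory returns one short keyword list per file, which could then be shown next to each file name for a first-pass appraisal. Ranking by raw word count follows the description above; a weighting such as TF-IDF might separate documents better but is outside what the issue asks for.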
Issue champion Richard Freeston
Other interested parties
 
Possible Solution approaches Solution from previous mashup event: Analysis of Lucene Index Word Frequency
Context
Lessons Learned
Datasets
Solutions  
Labels: issue, spruce_glasgow, spruce, unsolved_issue, appraisal_assessment, unknown_characteristics