Version 4 by Carl Wilson
on Dec 03, 2013 09:52.

compared with
Version 5 by Carl Wilson
on Dec 04, 2013 14:22.


h2. DROID
Performing DROID format identification on single streams is not always easy. The plan is to look at the nanite code, which addresses some of these difficulties, document its use and provide an exemplar. The nanite code cannot currently perform container characterisation; the group would like to see whether this is possible.

h2. File
File suffers from two inefficiencies: the need to create a shell sub-process and the requirement to operate on a file instance.
Makes multiple command-line calls on individual files, even though it is a Java application. It could be made more efficient by patching in fixes to the above tools.

h1. Working plan
h1. Results

Rather than working directly on Hadoop tasks, the group has concentrated on enhancing the [nanite](https://github.com/openplanets/nanite) and [FITS](https://github.com/harvard-lts/fits) code bases.


h2. DROID

So there has been some investi

h2. File
We borrowed the JHOVE2 JNA wrapper for file, removed the JHOVE dependencies and added one or two convenience methods. The initial code was placed in a [GitHub repository](https://github.com/openplanets/libmagic-jna-wrapper). For Java developers this offers the following advantages:

* A direct call to the libmagic library, avoiding the inefficiency of spawning a sub-process for magic identification.
* The possibility of performing stream-based magic identification, meaning streams from container formats do not need to be serialised to temporary files.
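
The second point can be illustrated with a minimal sketch. It does not use libmagic or the wrapper's API. Instead it assumes a tiny, hypothetical signature table (`SIGNATURES`) and a hypothetical `identify` method, to show how a format can be guessed from the first bytes of an `InputStream` without writing a temporary file. A real identification would pass the buffered bytes to libmagic.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class StreamMagicSketch {
    // Hypothetical, deliberately tiny signature table: MIME type -> leading magic bytes.
    // This stands in for libmagic's magic database; it is not part of any real library.
    private static final Map<String, byte[]> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("application/pdf", "%PDF-".getBytes(StandardCharsets.US_ASCII));
        SIGNATURES.put("image/png",
                new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A});
        SIGNATURES.put("application/zip", new byte[] {'P', 'K', 0x03, 0x04});
    }

    // Number of leading bytes buffered from the stream (an assumption for this sketch).
    private static final int HEADER_SIZE = 16;

    /** Guesses a MIME type from the first bytes of a stream; no temporary file is created. */
    public static String identify(InputStream in) throws IOException {
        byte[] header = new byte[HEADER_SIZE];
        int total = 0;
        // read() may return fewer bytes than requested, so loop until full or end of stream.
        while (total < HEADER_SIZE) {
            int n = in.read(header, total, HEADER_SIZE - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        for (Map.Entry<String, byte[]> entry : SIGNATURES.entrySet()) {
            if (startsWith(header, total, entry.getValue())) {
                return entry.getKey();
            }
        }
        return "application/octet-stream"; // fallback when no signature matches
    }

    // True if the first `length` valid bytes of `data` begin with `prefix`.
    private static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for an entry read out of a container format.
        InputStream entry = new ByteArrayInputStream(
                "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(identify(entry));
    }
}
```

Because `identify` only needs an `InputStream`, the same call would work on an entry stream obtained from a ZIP or similar container, which is the case the bullet above describes.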

h2. DROID

Te