h2. Status
{tip:title=Active}
h2. Investigator(s)
Clemens Neudecker, KB
h2. Dataset
One [sample batch|http://wiki.opf-labs.org/display/SP/KB+Metamorfoze+Migration+%28sample+batch%29] from the [Metamorfoze|http://www.metamorfoze.nl/english] project.
h2. Platform
[KB 1|http://wiki.opf-labs.org/display/SP/KB+Hadoop+Platform] Hadoop Platform
h2. Workflow
The migration is implemented as a [batch file|https://github.com/KBNLresearch/hadoop-jp2-experiment/blob/master/run.sh], in [Java code|https://github.com/KBNLresearch/hadoop-jp2-experiment], and as a [Taverna|http://www.taverna.org.uk/] workflow.
!migratie.png|border=1,width=600!
The latest Java code, workflow, and batch files are available on [GitHub|https://github.com/KBNLresearch/hadoop-jp2-experiment].
The workflow consists of the following steps:
* Recover TIFF file from storage (HDFS)
* Run [Exiftool|http://www.sno.phy.queensu.ca/~phil/exiftool/] to extract metadata from TIFF
* Migrate TIFF \-> JP2 (using [Aware JP2K SDK|http://www.aware.com/imaging/jpeg2000sdk.html])
* Run [Exiftool|http://www.sno.phy.queensu.ca/~phil/exiftool/] to extract metadata from JP2
* Run [Jpylyzer|http://openplanets.github.io/jpylyzer/] over the JP2
* Run [Probatron validator|http://www.probatron.org/probatron4j.html] over Jpylyzer outputs to validate conformance of migrated image to the specified profile
* Use [GraphicsMagick|http://www.graphicsmagick.org/] to compare TIFF and JP2
* Create report
* Create output package (JP2, results, etc.)
* Post files back to relevant storage
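
The steps above run one after another for each image, and the evaluation records a runtime and a success flag per step. The following is a minimal sketch of that control flow in Python, not the project's actual Java or Taverna implementation. The function {{run_workflow}} and the placeholder steps are hypothetical names introduced for illustration; real steps would call the external tools instead of filling in dictionary entries.

```python
import time
from typing import Callable, Dict, List, Tuple

# A step is a (name, function) pair; the function mutates the shared item dict
# and raises an exception when the step fails.
Step = Tuple[str, Callable[[dict], None]]


def run_workflow(steps: List[Step], item: dict) -> List[Dict[str, object]]:
    """Run the steps in order, stop at the first failure, and return a report
    with one entry per executed step (name, success flag, runtime in ms)."""
    report = []
    for name, func in steps:
        start = time.perf_counter()
        try:
            func(item)
            ok = True
        except Exception as exc:  # any failure marks the item as failed
            ok = False
            item["error"] = f"{name}: {exc}"
        runtime_ms = (time.perf_counter() - start) * 1000
        report.append({"step": name, "success": ok, "runtime_ms": runtime_ms})
        if not ok:
            break
    return report


# Hypothetical placeholder steps standing in for the real tool invocations.
def fetch_tiff(item: dict) -> None:
    item["tiff"] = item["id"] + ".tif"


def migrate_to_jp2(item: dict) -> None:
    item["jp2"] = item["id"] + ".jp2"


def compare_pixels(item: dict) -> None:
    if "jp2" not in item:
        raise RuntimeError("nothing to compare")


demo_steps = [
    ("fetch", fetch_tiff),
    ("migrate", migrate_to_jp2),
    ("compare", compare_pixels),
]
demo_report = run_workflow(demo_steps, {"id": "page_0001"})
for entry in demo_report:
    print(entry["step"], entry["success"])
```

Stopping at the first failed step keeps a broken item from reaching the packaging and upload steps, and the per-step entries are the raw material for the report.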
h2. Requirements and Policies
* The JP2 files produced by the migration must be valid JP2s (checked with Jpylyzer)
* The JP2 files produced by the migration must adhere to a specific profile (checked with Probatron against Jpylyzer output)
* The pixel comparison with the original TIFF image must return identical results (checked with GraphicsMagick)
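
The three requirements can be combined into a single accept/reject decision per image. The sketch below is illustrative only. It assumes that the Jpylyzer report is XML containing an {{isValidJP2}} element with the text {{True}} or {{False}}, as in Jpylyzer 1.x; other releases may name this element differently. The function names and the two boolean inputs (the Probatron result and the pixel comparison result) are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]


def jpylyzer_says_valid(report_xml: str) -> bool:
    """Return True if the report contains <isValidJP2>True</isValidJP2>.
    The element name is an assumption about the Jpylyzer output format."""
    root = ET.fromstring(report_xml)
    for element in root.iter():
        if local_name(element.tag) == "isValidJP2":
            return (element.text or "").strip() == "True"
    return False  # no verdict found: treat as not valid


def migration_accepted(report_xml: str, profile_ok: bool, pixels_identical: bool) -> bool:
    """All three requirements must hold for a migrated image to be accepted."""
    return jpylyzer_says_valid(report_xml) and profile_ok and pixels_identical


sample_report = "<jpylyzer><isValidJP2>True</isValidJP2></jpylyzer>"
print(migration_accepted(sample_report, profile_ok=True, pixels_identical=True))
```

Treating a missing verdict as invalid errs on the side of rejecting a file rather than accepting one that was never checked.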
h2. Evaluations
|| Metric || PW catalogue URI || Datatype || Description || Value || Comments ||
| NumberOfObjectsPerHour | | integer | Number of objects that can be processed per hour \\ | 1341 | Could be used both for component evaluations on a single machine and on entire platform setups \\ |
| MigrationCorrectnessInPercent | | integer \\ | Defining a statistical measure for binary evaluations | 100 % \\ | Between 0 and 100 \\ |
| ThroughputGbytesPerMinute \\ | | float \\ | The throughput of data measured in Gbytes per minute \\ | 0.47 | Specify in Gbytes per minute \\ |
| ThroughputGbytesPerHour | | float \\ | The throughput of data measured in Gbytes per hour \\ | 28.16 \\ | Specify in Gbytes per hour \\ |
| ReliableAndStableAssessment | | boolean \\ | Manual assessment of whether the experiment ran reliably and stably \\ | true \\ | |
| NumberOfFailedFiles | | integer \\ | Number of files that failed in the workflow \\ | 0 \\ | |
| NumberOfFailedFilesAcceptable | | boolean | Manual assessment of whether the number of files that fail in the workflow is acceptable \\ | true \\ | |
| QAFalseDifferentPercent | | integer | Percentage of content comparisons resulting in _original and migrated different_, even though human spot checking says _original and migrated similar_. | 0 % \\ | Between 0 and 100 |
| AverageRuntimePerItemInHours \\ | | float | The average processing time in hours per item \\ | 0.00099 \\ | Positive floating point number \\ |
| | | | | | |
| AwareCompressRuntimeAvg | | integer | Average running time of Aware jp2_compress in milliseconds | 2377 | |
| JpylyzerCheckRuntimeAvg | | integer | Average running time of Jpylyzer validation in milliseconds | 212 | |
| ProbatronCheckRuntimeAvg | | integer | Average running time of Probatron profile validation in milliseconds \\ | 1628 | |
| KakaduExpandRuntimeAvg | | integer | Average running time of Kakadu kdu_expand in milliseconds \\ | 2087 | |
| GMCompareRuntimeAvg | | integer | Average running time of GraphicsMagick pixel comparison in milliseconds \\ | 381 | |
| AwareCompressRuntimeFull \\ | | | Total running time of Aware jp2_compress | 5:18,46 | |
| JpylyzerCheckRuntimeFull \\ | | | Total running time of Jpylyzer validation \\ | 0:28,25 | |
| ProbatronCheckRuntimeFull \\ | | | Total running time of Probatron profile validation \\ | 3:38,18 | |
| KakaduExpandRuntimeFull \\ | | | Total running time of Kakadu kdu_expand \\ | 4:39,54 | |
| GMCompareRuntimeFull \\ | | | Total running time of GraphicsMagick pixel comparison \\ | 0:51,07 | |
| AwareCompressSuccess \\ | | | Success rate of Aware jp2_compress | 100 % | |
| JpylyzerCheckSuccess \\ | | | Success rate of Jpylyzer validation \\ | 100 % | |
| ProbatronCheckSuccess \\ | | | Success rate of Probatron profile validation \\ | 100 % | |
| KakaduExpandSuccess \\ | | | Success rate of Kakadu kdu_expand \\ | 100 % | |
| GMCompareSuccess \\ | | | Success rate of GraphicsMagick pixel comparison \\ | 100 % | |
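
The average and total runtimes in the table can be cross-checked against each other: dividing a tool's total runtime by its average runtime should give roughly the number of processed items. The sketch below does this arithmetic. It assumes that the totals are written as minutes:seconds,hundredths (so {{5:18,46}} is read as 318.46 seconds); that reading is an interpretation of the table, not something the table states. The function names are hypothetical.

```python
def parse_duration(text: str) -> float:
    """Convert 'M:SS,hh' to seconds, assuming minutes:seconds,hundredths."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds.replace(",", "."))


def implied_item_count(total: str, average_ms: int) -> float:
    """Total runtime divided by average runtime per item."""
    return parse_duration(total) / (average_ms / 1000.0)


# (total runtime, average runtime in ms) as reported in the table above
runtimes = {
    "AwareCompress": ("5:18,46", 2377),
    "JpylyzerCheck": ("0:28,25", 212),
    "ProbatronCheck": ("3:38,18", 1628),
    "KakaduExpand": ("4:39,54", 2087),
    "GMCompare": ("0:51,07", 381),
}

for name, (total, average_ms) in runtimes.items():
    print(f"{name}: {implied_item_count(total, average_ms):.1f} items")
```

Under that reading, all five ratios come out at roughly 133 to 134 items, which suggests the averages and totals were measured over the same set of files.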