Status

Active

Investigator(s)

Clemens Neudecker, KB

Dataset

One sample batch from the Metamorfoze project.

Platform

KB 1 Hadoop Platform

Workflow

The migration is implemented as a batch file, in Java code, and as a Taverna workflow.

The latest Java code, workflow, and batch files are available on GitHub.

The workflow comprises the following steps:

  • Retrieve the TIFF file from storage (HDFS)
  • Run Exiftool to extract metadata from TIFF
  • Migrate TIFF -> JP2 (using Aware JP2K SDK)
  • Run Exiftool to extract metadata from JP2
  • Run Jpylyzer over the JP2
  • Run Probatron validator over Jpylyzer outputs to validate conformance of migrated image to the specified profile
  • Use GraphicsMagick to compare TIFF and JP2
  • Create report
  • Create output package (JP2, results, etc.)
  • Post files back to relevant storage
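
The steps above run in a fixed order for each image, so they can be sketched as a small per-item runner that also records how long each step took. This is only an illustration, not the project's Java or Taverna implementation: the step names, the `placeholder` function, and the choice to stop at the first failing step are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A step receives the working state for one image and returns True on success.
Step = Callable[[dict], bool]


@dataclass
class ItemResult:
    item: str
    ok: bool = True
    failed_step: str = ""
    runtimes_ms: Dict[str, float] = field(default_factory=dict)


def run_workflow(item: str, steps: List[Tuple[str, Step]]) -> ItemResult:
    """Run the named steps in order for one item.

    Assumption: processing stops at the first step that fails or raises.
    """
    state = {"item": item}
    result = ItemResult(item=item)
    for name, step in steps:
        start = time.perf_counter()
        try:
            ok = bool(step(state))
        except Exception:
            ok = False
        # Record the wall-clock duration of this step in milliseconds.
        result.runtimes_ms[name] = (time.perf_counter() - start) * 1000.0
        if not ok:
            result.ok = False
            result.failed_step = name
            break
    return result


def placeholder(state: dict) -> bool:
    # Hypothetical stand-in: a real step would invoke the external tool here.
    return True


# Hypothetical step names mirroring the list above.
WORKFLOW_STEPS: List[Tuple[str, Step]] = [
    ("fetch_tiff", placeholder),
    ("exiftool_tiff", placeholder),
    ("migrate_to_jp2", placeholder),
    ("exiftool_jp2", placeholder),
    ("jpylyzer", placeholder),
    ("probatron", placeholder),
    ("gm_compare", placeholder),
    ("report", placeholder),
    ("package", placeholder),
    ("store", placeholder),
]
```

Calling `run_workflow("sample.tif", WORKFLOW_STEPS)` returns an `ItemResult` whose `runtimes_ms` holds one entry per executed step.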

Requirements and Policies

  • The JP2 files produced by the migration must be valid JP2s (checked with Jpylyzer)
  • The JP2 files produced by the migration must adhere to a specific profile (checked with Probatron against Jpylyzer output)
  • The pixel comparison with the original TIFF image must show that the two images are identical (checked with GraphicsMagick)
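
The three requirements amount to a conjunction of checks: a migrated file is accepted only if all of them pass. The sketch below shows one way to read the validity verdict from a Jpylyzer XML report and combine it with the other two results. It assumes the verdict is held in an element named `isValidJP2` or `isValid` with the text `True`; the element name depends on the Jpylyzer version, so check it against the output of the version in use. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def jpylyzer_reports_valid(xml_text: str) -> bool:
    """Return True if a Jpylyzer report marks the file as valid.

    Assumption: the verdict is the text of an element named
    isValidJP2 or isValid, depending on the Jpylyzer version.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]  # drop any XML namespace
        if local_name in ("isValidJP2", "isValid"):
            return (elem.text or "").strip().lower() == "true"
    return False  # no verdict found: treat the file as not valid


def migration_accepted(valid_jp2: bool,
                       matches_profile: bool,
                       pixels_identical: bool) -> bool:
    """All three requirements must hold for a migrated file to be accepted."""
    return valid_jp2 and matches_profile and pixels_identical
```

Treating a report without a verdict as invalid is a deliberately conservative choice for this example.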

Evaluations

Metric | PW catalogue URI | Datatype | Description | Value | Comments
NumberOfObjectsPerHour |  | integer | Number of objects that can be processed per hour | 1341 | Could be used both for component evaluations on a single machine and on entire platform setups
MigrationCorrectnessInPercent |  | integer | Defining a statistical measure for binary evaluations | 100 % | Between 0 and 100
ThroughputGbytesPerMinute |  | float | The throughput of data measured in Gbytes per minute | 0,47 | Specify in Gbytes per minute
ThroughputGbytesPerHour |  | float | The throughput of data measured in Gbytes per hour | 28,16 | Specify in Gbytes per hour
ReliableAndStableAssessment |  | boolean | Manual assessment of whether the experiment performed reliably and stably | true |
NumberOfFailedFiles |  | integer | Number of files that failed in the workflow | 0 |
NumberOfFailedFilesAcceptable |  | boolean | Manual assessment of whether the number of files that fail in the workflow is acceptable | true |
QAFalseDifferentPercent |  | integer | Number of content comparisons that report the original and the migrated file as different even though human spot checking finds them similar | 0 % | Between 0 and 100
AverageRuntimePerItemInHours |  | float | The average processing time in hours per item | 0.00099 | Positive floating point number
AwareCompressRuntimeAvg |  | integer | Average running time of Aware jp2_compress in milliseconds | 2377 |
JpylyzerCheckRuntimeAvg |  | integer | Average running time of Jpylyzer validation in milliseconds | 212 |
ProbatronCheckRuntimeAvg |  | integer | Average running time of Probatron profile validation in milliseconds | 1628 |
KakaduExpandRuntimeAvg |  | integer | Average running time of Kakadu kdu_expand in milliseconds | 2087 |
GMCompareRuntimeAvg |  | integer | Average running time of GraphicsMagick pixel comparison in milliseconds | 381 |
AwareCompressRuntimeFull |  |  | Total running time of Aware jp2_compress | 5:18,46 |
JpylyzerCheckRuntimeFull |  |  | Total running time of Jpylyzer validation | 0:28,25 |
ProbatronCheckRuntimeFull |  |  | Total running time of Probatron profile validation | 3:38,18 |
KakaduExpandRuntimeFull |  |  | Total running time of Kakadu kdu_expand | 4:39,54 |
GMCompareRuntimeFull |  |  | Total running time of GraphicsMagick pixel comparison | 0:51,07 |
AwareCompressSuccess |  |  | Success rate of Aware jp2_compress | 100 % |
JpylyzerCheckSuccess |  |  | Success rate of Jpylyzer validation | 100 % |
ProbatronCheckSuccess |  |  | Success rate of Probatron profile validation | 100 % |
KakaduExpandSuccess |  |  | Success rate of Kakadu kdu_expand | 100 % |
GMCompareSuccess |  |  | Success rate of GraphicsMagick pixel comparison | 100 % |
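
Several of the metrics above are related by simple arithmetic, which the following sketch makes explicit. It assumes that the total running times are written as minutes:seconds with a decimal comma (so `5:18,46` means 5 minutes 18.46 seconds); the page does not state this format, and all function names are hypothetical.

```python
def parse_min_sec(text: str) -> float:
    """Convert 'm:ss,ff' (decimal comma or decimal point) to seconds."""
    minutes, seconds = text.strip().split(":")
    return int(minutes) * 60 + float(seconds.replace(",", "."))


def per_hour(per_minute: float) -> float:
    """Scale a per-minute rate (e.g. Gbytes per minute) to a per-hour rate."""
    return per_minute * 60.0


def success_rate_percent(succeeded: int, total: int) -> float:
    """Share of successful runs as a percentage between 0 and 100."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * succeeded / total


def implied_item_count(total_seconds: float, avg_ms: float) -> int:
    """Estimate how many items a total runtime and a per-item average imply."""
    return round(total_seconds * 1000.0 / avg_ms)


# Aware jp2_compress: total 5:18,46 at an average of 2377 ms per item.
aware_total_s = parse_min_sec("5:18,46")
print(implied_item_count(aware_total_s, 2377))

# Throughput: 0,47 Gbytes per minute scaled to one hour.
print(per_hour(0.47))
```

Under the stated format assumption, the Aware figures imply a batch of roughly 134 items. Scaling the rounded 0,47 Gbytes per minute gives about 28.2 Gbytes per hour, close to the reported 28,16, which was presumably computed from an unrounded value.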