Evaluation specs component level

Evaluation seq. num.
Use only if a subsequent evaluation of the same evaluation is done in a different setup than a previous one.
In that case, copy the Evaluation specs table and fill out a new one with a new sequence number.
For the first evaluation, leave this field at "1".
Evaluator-ID email [email protected]
Matchbox evaluation applied to a data set of 40 books (~330 page images per book) from the Austrian Books online collection of the Austrian National Library. The performance of duplicate page detection is measured as the average runtime of Matchbox per book. The Taverna Workflow Workbench is used in batch processing mode without Hadoop. Matchbox is executed on a server with 4 physical cores, and Taverna is configured to process 4 books in parallel. Additionally, the correctness of the Matchbox duplicate detection has been evaluated using a small sample of 7 books with ground truth indicating which pages should be identified as duplicate pages.
Textual description of the evaluation and the overall goals
Evaluation-Date DD/MM/YY 28/11/12
Dataset(s)
40 books from the Austrian Books online collection of the Austrian National Library
Link to dataset page(s) on WIKI
For each dataset that is a part of an evaluation
make sure that the dataset is described here: Datasets
Workflow method
Taverna Workflow Workbench, batch processing using "tool" service components, 4 processes in parallel on a server with 4 physical cores (without Hadoop).
Taverna / Commandline / Direct hadoop etc...
Workflow(s) involved
Link(s) to MyExperiment if applicable
Tool(s) involved
Matchbox
Link(s) to distinct versions of specific components/tools in the component registry if applicable
Link(s) to Scenario(s)
LSDRT11 Duplicate image detection within one book
Link(s) to scenario(s) if applicable
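
The setup described above (four books processed concurrently, with the average runtime per book as the performance measure) can be sketched as follows. This is a minimal illustration, not the Taverna workflow itself: `process_book` is a hypothetical callable standing in for whatever runs Matchbox on one book, and the names `timed_run`, `average_runtime_hours` and `books_per_hour` are assumptions introduced here. The sketch also assumes that throughput per process is taken as the reciprocal of the average runtime.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(process_book, book_id):
    """Run process_book(book_id) and return the elapsed wall-clock time in hours."""
    start = time.perf_counter()
    process_book(book_id)  # hypothetical stand-in for one Matchbox run on one book
    return (time.perf_counter() - start) / 3600.0


def average_runtime_hours(process_book, book_ids, workers=4):
    """Process books with `workers` running concurrently; return mean hours per book."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runtimes = list(pool.map(lambda b: timed_run(process_book, b), book_ids))
    return sum(runtimes) / len(runtimes)


def books_per_hour(avg_runtime_hours):
    """Assumed relation: per-process throughput is the reciprocal of the average runtime."""
    return 1.0 / avg_runtime_hours
```

Under that assumption, an average runtime of 5.4 hours per book gives `books_per_hour(5.4)` of roughly 0.185, which is in line with the value of 0.18 reported in the evaluation points. Threads are used here on the assumption that `process_book` starts an external process; a CPU-bound Python function would need a process pool instead.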

Technical setup

Description String FUE-L Rack Server at ONB
Use the platform name
Total number of physical CPUs integer 1
CPU specs string Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
Total number of CPU-cores integer 4
Total amount of RAM in Gbytes integer 12
Operating System String Ubuntu Linux Server 12.04.1 LTS
Linux (specific distribution), Windows (specific distribution), other?
Storage system/layer String NFS


Evaluation points

Metrics must come from / be registered in the metrics catalogue.

Metric Baseline definition Baseline value Goal Evaluation 1 (28/11/12) Evaluation 2 (date) Evaluation 3 (date)
NumberOfObjectsPerHour Number of books (~330 page images per book) processed per hour.
1 0.18
1 0.18
Average runtime of processing one book in hours. -
1 5.4
1 5.4
NumberOfFailedFiles Number of book processings that failed in the workflow.
0 0
0 0    
IdentificationCorrectnessInPercent Average F-Measure (Precision and Recall combined)
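
The IdentificationCorrectnessInPercent metric is defined above as an average F-measure. One way to compute it is sketched below, assuming the F-measure is the balanced F1 score (harmonic mean of precision and recall), that duplicates are represented as pairs of page identifiers, and that the average is an unweighted mean over books. The function names and the data layout are illustrative assumptions, not taken from the evaluation itself.

```python
def f_measure(detected, ground_truth):
    """Balanced F-measure (F1) of detected duplicate pairs against ground truth."""
    detected, ground_truth = set(detected), set(ground_truth)
    if not detected and not ground_truth:
        # Assumed convention: nothing to find and nothing reported counts as perfect.
        return 1.0
    true_positives = len(detected & ground_truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(detected)
    recall = true_positives / len(ground_truth)
    return 2 * precision * recall / (precision + recall)


def average_f_measure_percent(per_book):
    """Unweighted mean F-measure over (detected, ground_truth) pairs, in percent."""
    scores = [f_measure(detected, truth) for detected, truth in per_book]
    return 100.0 * sum(scores) / len(scores)
```

For the 7-book sample, `per_book` would hold one `(detected, ground_truth)` entry per book, each a collection of page-identifier pairs. Averaging per book weights every book equally regardless of how many duplicates it contains; pooling all pairs across books before computing precision and recall would be an alternative design.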
