| Evaluation seq. num.
|| Use only if a subsequent run of the same evaluation is done in a different setup than a previous one.
In that case copy the Evaluation specs table and fill out a new one with a new sequence number.
For the first evaluation leave this field at "1"
|Evaluator-ID||firstname.lastname@example.org|| Unique ID of the evaluator that carried out this specific evaluation.
|Evaluation description||text|| Matchbox evaluation applied to a data set of 40 books (~330 page images per book) from the Austrian Books Online collection of the Austrian National Library. The performance of duplicate page detection is measured by the average runtime of Matchbox per book. The Taverna Workflow Workbench is used in batch processing mode without Hadoop. Matchbox is executed on a server with 4 physical cores, and Taverna is configured to process 4 books in parallel. Additionally, the correctness of the Matchbox duplicate detection has been evaluated using a small sample of 7 books with ground truth indicating which pages should be identified as duplicate pages.
|| Textual description of the evaluation and the overall goals
|Evaluation-Date||DD/MM/YY||28/11/12|| Date of evaluation
|| 40 books from the Austrian Books online collection of the Austrian National Library
|| Link to dataset page(s) on WIKI
For each dataset that is a part of an evaluation
make sure that the dataset is described here: Datasets
|Workflow method|| string
|| Taverna Workflow Workbench, batch processing using "tool" service components, 4 processes in parallel on a server with 4 physical cores (without Hadoop).
|| Taverna / Commandline / Direct hadoop etc...
| Workflow(s) involved
|| Link(s) to MyExperiment if applicable
| Tool(s) involved
|| Link(s) to distinct versions of specific components/tools in the component registry if applicable
|Link(s) to Scenario(s)|| URL(s)
|| LSDRT11 Duplicate image detection within one book
|| Link(s) to scenario(s) if applicable
|Description||String||FUE-L Rack Server at ONB|| Unique string that identifies this specific platform.
Use the platform name
|Total number of physical CPUs||integer||1||Number of CPUs involved|
|CPU specs||string||Intel(R) Xeon(R) CPU, E5540 @ 2.53GHz|| Specification of CPUs
|Total number of CPU-cores||integer||4|| Number of CPU-cores involved
| Total amount of RAM in Gbytes
||integer||12GB|| Total amount of RAM on all nodes
| Operating System
||String|| Ubuntu Linux Server 12.04.1 LTS
||Linux (specific distribution), Windows (specific distribution), other?|
|Storage system/layer||String||NFS||NFS, HDFS, local files, ?|
Metrics must come from, or be registered in, the metrics catalogue.
|Metric||Baseline definition||Baseline value||Goal|| Evaluation 1 (28/11/12)
|| Evaluation 2 (date)
|| Evaluation 3 (date)
|NumberOfObjectsPerHour|| Number of books (~330 page images per book) processed per hour.
||Average runtime of processing one book in hours.|| -
|NumberOfFailedFiles|| Number of book processings that failed in the workflow.
|IdentificationCorrectnessInPercent|| Average F-Measure (Precision and Recall combined)
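The following is a minimal sketch of how the throughput and correctness metrics in the table above could be computed; it is not the code used in the evaluation. It assumes that duplicates are represented as sets of page-number pairs per book, that the F-measure is the balanced F1 score averaged over books, and that throughput is derived from the average per-book runtime multiplied by the number of books processed in parallel. All function names are hypothetical.

```python
from statistics import mean


def f_measure(detected: set, expected: set) -> float:
    """Balanced F-measure (F1) for one book.

    detected: duplicate page pairs reported by the tool.
    expected: duplicate page pairs listed in the ground truth.
    """
    if not detected and not expected:
        # Assumption: a book without duplicates and without detections counts as correct.
        return 1.0
    true_positives = len(detected & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(detected)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


def identification_correctness_percent(books) -> float:
    """Average F-measure over (detected, expected) pairs, as a percentage."""
    return 100 * mean(f_measure(detected, expected) for detected, expected in books)


def books_per_hour(avg_runtime_hours: float, parallel_books: int = 1) -> float:
    """Throughput, assuming the runtime is measured per book while
    parallel_books books are processed at the same time."""
    return parallel_books / avg_runtime_hours
```

For example, `books_per_hour(0.5, parallel_books=4)` yields 8 books per hour for a hypothetical average runtime of half an hour per book; the value 0.5 is purely illustrative and not a measured result.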