h1. Evaluation specs component level
|| Field \\ || Datatype \\ || Value \\ || Description \\ ||
| Evaluation seq. num. \\ | int \\ | 1 \\ | Use only if a subsequent run of the same evaluation is done in a different setup than a previous one. \\
In that case, copy the Evaluation specs table and fill out a new one with a new sequence number. \\
For the first evaluation, leave this field at "1". \\ |
| Evaluator-ID | email | [email protected] \\ | Unique ID of the evaluator who carried out this specific evaluation. \\ |
| Evaluation description | text | Evaluating correctness of mimetype identification of Web Content on a controlled annotated corpus \\ | Textual description of the evaluation and the overall goals \\ |
| Evaluation-Date | DD/MM/YYYY | 20/07/2012 \\ | Date of evaluation \\ |
| Dataset(s) | string \\ | [SP:Govdocs1 Open Corpus]\\ | Link to dataset page(s) on the wiki. \\
For each dataset that is part of an evaluation, \\
make sure that the dataset is described here: [SP:Datasets]\\ |
| Workflow method | string \\ | commandline \\ | Taverna / command line / direct Hadoop, etc. \\ |
| Workflow(s) involved \\ | URL(s) \\ | N/A \\ | Link(s) to MyExperiment *if applicable* \\ |
| Tool(s) involved \\ | URL(s) | Apache TIKA 0.7 | Link(s) to distinct versions of specific components/tools in the component registry *if applicable* \\ |
| Link(s) to Scenario(s) | URL(s) \\ | [WCT4|SP:WCT4 Web Archive Mime-Type detection at Austrian National Library]\\ | Link(s) to scenario(s) *if applicable* \\ |
h1. Technical setup
|| Field \\ || Datatype \\ || Value \\ || Description \\ ||
| Description | String | iapetus.statsbiblioteket.dk \\ | Human-readable description of the "platform", e.g. Bjarne's Linux PC |
| Total number of physical CPUs | integer | 2 \\ | Number of CPUs involved \\ |
| CPU specs | string | Intel® Xeon® Processor X5670 \\
(12M Cache, 2.93 GHz, 6.40 GT/s Intel® QPI) | Specification of CPUs \\ |
| Total number of CPU-cores | integer | 12 \\ | Number of CPU-cores involved \\ |
| Total amount of RAM in Gbytes \\ | integer | 96 \\ | Total amount of RAM on all nodes \\ |
| Operating System \\ | String | Linux \\ | Linux (specific distribution), Windows (specific distribution), other? |
| Storage system/layer | String | NFS mounted files \\ | NFS, HDFS, local files, ? |
h1. Evaluation points
Metrics must come from, or be registered in, the [metrics catalogue|Metrics Catalogue].
|| Metric || Baseline definition\\ || Baseline value || Goal || Evaluation 1 (20/07/2012) \\ || Evaluation 2 (10/06/2013) \\ || Evaluation 3 (01/03/2014) \\ ||
| IdentificationCorrectnessInPercent \\ | What is possible with the Jhove2 default distribution\\ | 85 % \\ | 99 % \\ | 94 % \\ | 97 % \\ | 99 % \\ |
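The IdentificationCorrectnessInPercent values above could be derived along the following lines. This is only a minimal sketch, assuming the metric is the share of annotated files whose detected MIME type equals the annotated one; the exact definition in the metrics catalogue may differ. The function name `identification_correctness_percent` and the sample file names are hypothetical and do not come from the evaluation itself.

```python
def identification_correctness_percent(expected, detected):
    """Percentage of annotated files whose detected MIME type matches.

    expected: dict mapping file name -> annotated (ground truth) MIME type
    detected: dict mapping file name -> MIME type reported by the tool
    Files missing from `detected` count as incorrect (an assumption).
    """
    if not expected:
        raise ValueError("expected must contain at least one annotated file")
    correct = sum(
        1
        for name, mime in expected.items()
        if detected.get(name, "").lower() == mime.lower()
    )
    return 100.0 * correct / len(expected)


# Hypothetical sample data, not taken from the Govdocs1 corpus.
ground_truth = {
    "000001.pdf": "application/pdf",
    "000002.html": "text/html",
    "000003.jpg": "image/jpeg",
    "000004.doc": "application/msword",
}
tool_output = {
    "000001.pdf": "application/pdf",
    "000002.html": "text/html",
    "000003.jpg": "image/jpeg",
    "000004.doc": "application/x-tika-msoffice",
}

score = identification_correctness_percent(ground_truth, tool_output)
print(f"{score:.1f} %")
```

With this sample data, three of the four files are identified correctly. Comparing MIME types case-insensitively and counting undetected files as errors are design choices of the sketch, not documented rules of the evaluation.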