h1. Evaluation specs component level

|| Field \\ || Datatype \\ || Value \\ || Description \\ ||
| Evaluation seq. num. \\ | int \\ | 1 \\ | Use only if a subsequent evaluation of the same evaluation is done in a different setup than a previous one. \\
In that case, copy the Evaluation specs table and fill out a new one with a new sequence number. \\
For the first evaluation, leave this field at "1" \\ |
| Evaluator-ID | email | baj@statsbiblioteket.dk | Unique ID of the evaluator who carried out this specific evaluation. \\ |
| Evaluation description | text | The overall goals are, first, to evaluate the stability of Taverna as the in-production workflow engine. Second, we want to test FFprobe as a characterisation, file format validation and property validation tool (in combination with Schematron) for 'Large Video Files'. Third, we would like to test the Isilon Storage performance when the scratch storage is moved from NAS/SAN to Isilon Storage. \\ | Textual description of the evaluation and the overall goals \\ |
| Evaluation-Date | DD/MM/YY | 29/07/13 | Date of evaluation \\ |
| Dataset(s) | string \\ | [Danish TV broadcasts, mpeg-2 transport stream|SP:Danish TV broadcasts, mpeg-2 transport stream]\\
Danish TV broadcast, H.264/MPEG-4 AVC \\
Danish Radio broadcast, MPEG1-Layer 2 | Link to dataset page(s) on WIKI \\
For each dataset that is a part of an evaluation \\
make sure that the dataset is described here: [SP:Datasets]\\ |
| Workflow method | string \\ | Taverna | Taverna / command line / direct Hadoop, etc. \\ |
| Workflow(s) involved \\ | URL(s) \\ | [https://github.com/statsbiblioteket/youseeingestworkflow] | Link(s) to MyExperiment *if applicable* \\ |
| Tool(s) involved \\ | URL(s) | [Ffprobe|http://wiki.opf-labs.org/display/TR/Ffprobe] (a large number of tools are involved, but the focus in this evaluation is Ffprobe) \\ | Link(s) to distinct versions of specific components/tools in the component registry *if applicable* \\ |
| Link(s) to Scenario(s) | URL(s) \\ | [Characterisation and validation of audio and video files during ingest] | Link(s) to scenario(s) *if applicable* \\ |
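
As an illustration of the kind of characterisation Ffprobe provides, the following is a minimal sketch, not the invocation used in the actual ingest workflow. It assumes `ffprobe` is on the `PATH` and asks it for JSON output; the function names `ffprobe_command`, `summarise` and `probe` are hypothetical, and the Schematron validation step is not shown.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe call that reports container and stream info as JSON."""
    return [
        "ffprobe",
        "-v", "error",            # only print errors on stderr
        "-print_format", "json",  # machine-readable output on stdout
        "-show_format",           # container-level properties
        "-show_streams",          # one entry per detected stream
        path,
    ]


def summarise(probe_json):
    """Reduce ffprobe's JSON output to a few properties (hypothetical selection)."""
    data = json.loads(probe_json)
    streams = data.get("streams", [])
    return {
        "format_name": data.get("format", {}).get("format_name"),
        "video_streams": sum(1 for s in streams if s.get("codec_type") == "video"),
        "audio_streams": sum(1 for s in streams if s.get("codec_type") == "audio"),
        "codecs": [s.get("codec_name") for s in streams],
    }


def probe(path):
    """Run ffprobe on a file; raises CalledProcessError if ffprobe fails."""
    result = subprocess.run(
        ffprobe_command(path), capture_output=True, text=True, check=True
    )
    return summarise(result.stdout)
```

A non-zero exit from `ffprobe` surfaces here as an exception, which a calling workflow could treat as a failed file. The stream counts reflect only the streams Ffprobe actually detects.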

h1. Technical setup

|| Field \\ || Datatype \\ || Value \\ || Description \\ ||
| Description | String | [SP:SB Video File Ingest Platform] | Human-readable description of the "platform" - e.g. Bjarne's Linux PC |
| Total number of physical CPUs | integer | 8 | Number of CPUs involved \\ |
| CPU specs | string | Intel® Xeon® Processor X5355 (8M Cache, 2.66 GHz, 1333 MHz FSB) | Specification of CPUs \\ |
| Total number of CPU-cores | integer | 32 | Number of CPU-cores involved \\ |
| Total amount of RAM in Gbytes \\ | integer | 32 | Total amount of RAM on all nodes \\ |
| Operating System \\ | String | Linux (CentOS release 6.3 (Final)) | Linux (specific distribution), Windows (specific distribution), other? |
| Storage system/layer | String | NAS/SAN + NFS | NFS, HDFS, local files, ? |

h1. Evaluation points

Metrics must come from, or be registered in, the [metrics catalogue|Metrics Catalogue].

|| Metric || Baseline definition || Baseline value (24/07/13) || Goal || Evaluation 1 (date) \\ || Evaluation 2 (date) \\ || Evaluation 3 (date) \\ ||
| ReliableAndStableAssessment | *Reliability - Runtime stability*  (focus: Taverna workflow) \\
Manual assessment. The workflow is set up with a workflow monitor in which each step of the workflow is recorded. If a file does not complete the workflow, it is added to the download list again and run through the workflow again. It is also marked as 'failed', which can be viewed in the workflow monitor GUI, and a mail is sent to the digital content manager. If it later completes, it is simply marked 'completed'. If the workflow cannot complete due to an inconsistency (e.g. wrong file format), the content provider is contacted. \\ | true | true \\ | | | |
| NumberOfFailedFilesAcceptable | *Reliability - Runtime stability* (focus: Ffprobe component) \\
Manual assessment. Ffprobe returns errors on broken files. This is acceptable, as these files should not continue through the workflow but should instead be re-downloaded. Ffprobe may also miss audio tracks that are not present from the beginning of the video file but are added at a later timestamp. This means that the characterisation information from Ffprobe is not complete. However, we also have characterisation information from a different tool, so we consider this acceptable. \\ | true \\ | true \\ | | | |
| MaxObjectSizeHandledInGbytes | *Performance efficiency - Capacity* (focus: Taverna workflow - all components) | 3.91 GB \\ | 5 | | | |
| MinObjectSizeHandledInMbytes | *Performance efficiency - Capacity* (focus: Taverna workflow - all components) | 113.79 MB \\ | 100 \\ | | | |
| ... \\ | | | | | | |
| NumberOfObjectsPerHour | *Performance efficiency - Capacity / Time behaviour* (focus: Taverna workflow) \\
The ingest workflow runs for approximately 8 hours a day and ingests between 800 GB and 1 TB of radio and TV broadcasts. \\ | 76 | 100 \\ | | | |
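
The throughput figures in the table can be related to each other with some simple arithmetic. The sketch below is illustrative only: it assumes the baseline rate of 76 objects per hour is sustained over the whole 8-hour daily run and that 1 TB is counted as 1000 GB. Neither assumption is stated on this page, and the function names are hypothetical.

```python
# Figures taken from the table above; combining them this way is an assumption.
HOURS_PER_DAY = 8                  # approximate daily run time of the workflow
BASELINE_OBJECTS_PER_HOUR = 76     # baseline value for NumberOfObjectsPerHour
GOAL_OBJECTS_PER_HOUR = 100        # goal for NumberOfObjectsPerHour
DAILY_VOLUME_GB = (800, 1000)      # 800 GB to 1 TB, assuming 1 TB = 1000 GB


def objects_per_day(objects_per_hour, hours=HOURS_PER_DAY):
    """Objects ingested per day if the hourly rate holds for the whole run."""
    return objects_per_hour * hours


def mean_object_size_gb(daily_volume_gb, objects_per_hour, hours=HOURS_PER_DAY):
    """Average object size implied by a daily volume and an hourly rate."""
    return daily_volume_gb / objects_per_day(objects_per_hour, hours)


if __name__ == "__main__":
    print("baseline objects/day:", objects_per_day(BASELINE_OBJECTS_PER_HOUR))
    print("goal objects/day:", objects_per_day(GOAL_OBJECTS_PER_HOUR))
    for volume in DAILY_VOLUME_GB:
        size = mean_object_size_gb(volume, BASELINE_OBJECTS_PER_HOUR)
        print(f"{volume} GB/day -> mean object size {size:.2f} GB")
```

Under these assumptions the baseline corresponds to 608 objects per day with a mean object size of roughly 1.3 to 1.6 GB, which lies between the minimum and maximum object sizes recorded in the table.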