
h2. Evaluator(s)

_Alastair Duncan_


h2. Evaluation points

h5. Baseline

|| Metric || Description || Metric baseline || Metric goal || _20/06/2014_ _Maps per node 8 Split 1_ || _20/06/2014_ _Maps per node 4 Split 4_ || _20/06/2014_ _Maps per node 4 Split 50_ ||
| NumberOfObjectsPerHour | Number of objects processed in one hour | 479.3 | | 998.32 | 720 | 238.55 |
| MaxObjectSizeHandledInGbytes | Max size of raw files | 0.16689453 | | 0.16689453 | 0.16689453 | 0.16689453 |
| MinObjectSizeHandledInGbytes | Min size of raw files | 2.24113e-5 | | 2.24113e-5 | 2.24113e-5 | 2.24113e-5 |
| ThroughputGbytesPerMinute | The throughput of data measured in Gigabytes per minute | 0.246 | | 0.513 | 0.370 | 0.123 |
| ThroughputGbytesPerHour | The throughput of data measured in Gigabytes per hour | 14.764 | | 30.768 | 22.190 | 7.352 |
| ReliableAndStableAssessment | Manual assessment of whether the experiment ran reliably and stably | true | | true | true | true |
| NumberOfFailedFiles | Number of files that failed in the workflow | 1 | | 1 | 1 | 1 |
| AverageRuntimePerItemInSeconds | The average processing time in seconds per item | 7.51 | | 3.60 | 5.0 | 15.09 |
| throughput in bytes per second | The throughput of data measured in bytes per second | 4403436.169 | | 9176908.992 | 6618498.000 | 2191875.843 |
| number of objects per second | Number of objects that can be processed per second | 0.13 | | 0.28 | 0.20 | 0.07 |
| max object size handled in bytes | Max size of raw files | 160312320 | | 160312320 | 160312320 | 160312320 |
| min object size handled in bytes | Min size of raw files | 24064 | | 24064 | 24064 | 24064 |
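
Several rows in the table are unit conversions of two measured values: objects per hour and bytes per second. The following is a minimal sketch of those conversions, not the script used in the evaluation. The function name {{derived_metrics}} is hypothetical, and the factor of 1024^3 bytes per Gbyte is an assumption, chosen because it reproduces the tabulated figures.

```python
# Sketch only: re-derive the dependent metrics in the table from two measured values.
# Assumption: 1 Gbyte = 1024**3 bytes (this reproduces the tabulated throughput figures).

BYTES_PER_GBYTE = 1024 ** 3
SECONDS_PER_HOUR = 3600


def derived_metrics(objects_per_hour, bytes_per_second):
    """Return the metrics that follow directly from the two measured rates."""
    return {
        "AverageRuntimePerItemInSeconds": SECONDS_PER_HOUR / objects_per_hour,
        "ObjectsPerSecond": objects_per_hour / SECONDS_PER_HOUR,
        "ThroughputGbytesPerMinute": bytes_per_second * 60 / BYTES_PER_GBYTE,
        "ThroughputGbytesPerHour": bytes_per_second * SECONDS_PER_HOUR / BYTES_PER_GBYTE,
    }


# Values taken from the "Metric baseline" column of the table.
baseline = derived_metrics(objects_per_hour=479.3, bytes_per_second=4403436.169)

# Values taken from the "Maps per node 8 Split 1" column of the table.
map8_split1 = derived_metrics(objects_per_hour=998.32, bytes_per_second=9176908.992)

for name, value in baseline.items():
    print(f"{name}: {value:.3f}")
```

Comparing the printed values with the corresponding table rows gives a quick consistency check on each column.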

h5. WebDAV

[http://fue.onb.ac.at/scape-tb-evaluation/stfc/raw2nexus/smalldatasetmap4split50/]

[http://fue.onb.ac.at/scape-tb-evaluation/stfc/raw2nexus/smalldatasetmap4split4/]

[http://fue.onb.ac.at/scape-tb-evaluation/stfc/raw2nexus/smalldatasetmap8split1/]




h2. Evaluation notes

_Timings for moving the data onto HDFS and generating the input files for ToMaR are not included in any of the evaluations. Baseline results are for the small dataset, with the non-Hadoop workflow executed using Taverna on a single node of the Hadoop cluster. The baseline timings include Taverna overheads, because no per-stage timings are available from the CLI version of Taverna 2.4. Taverna overheads are not included in the Hadoop experiments._

_The one failed file in each run was caused by a missing metadata file for one of the test files._