| ThroughputGbytesPerHour | Throughput of data, measured in gigabytes per hour | 14.764 | 30.768 | 22.190 | 7.352 |
| ReliableAndStableAssessment | Manual assessment of whether the experiment ran reliably and stably | true | true | true | true |
| NumberOfFailedFiles | Number of files that failed in the workflow | 1 | 1 | 1 | 1 |
| AverageRuntimePerItemInSeconds | Average processing time per item, in seconds | 7.51 | 3.60 | 5.0 | 15.09 |
| throughput in bytes per second | Throughput of data, measured in bytes per second | 4403436.169 | 9176908.992 | 6618498.000 | 2191875.843 |
| number of objects per second | Number of objects that can be processed per second | 0.13 | 0.28 | 0.20 | 0.07 |
| max object size handled in bytes | Maximum size of the raw files | 160312320 | 160312320 | 160312320 | 160312320 |
| min object size handled in bytes | Minimum size of the raw files | 24064 | 24064 | 24064 | 24064 |

_Timings for moving the data onto HDFS and for generating the ToMaR input files are not included in any of the evaluations. Baseline results are for the small dataset, with the non-Hadoop workflow executed using Taverna on a single node of the Hadoop cluster. Baseline timings include Taverna overheads; no per-stage timings are available from the CLI version of Taverna 2.4. Taverna overheads are not included in the Hadoop experiments._

_The single failure was caused by a missing metadata file for one of the test files._
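The derived metrics in the table are related by simple unit conversions: the objects-per-second figure is the reciprocal of the average runtime per item, and the gigabyte-per-hour figure matches the byte-per-second throughput scaled by 3600 seconds and divided by 2^30 (i.e. the "Gbytes" here appear to be binary gigabytes). A minimal sketch of the arithmetic follows; the helper names are illustrative only and not taken from any evaluation code:

{code:python}
# Illustrative check of how the derived metrics in the table relate.
# Function and constant names are hypothetical, not from the evaluation.

BYTES_PER_GIB = 2**30       # the table's throughput figures match binary GB (GiB)
SECONDS_PER_HOUR = 3600

def throughput_gib_per_hour(bytes_per_second: float) -> float:
    """Convert a bytes-per-second throughput to GiB per hour."""
    return bytes_per_second * SECONDS_PER_HOUR / BYTES_PER_GIB

def objects_per_second(avg_runtime_per_item_s: float) -> float:
    """Objects per second is the reciprocal of the average per-item runtime."""
    return 1.0 / avg_runtime_per_item_s

# Checked against the baseline column of the table:
print(round(throughput_gib_per_hour(4403436.169), 3))  # -> 14.764
print(round(objects_per_second(7.51), 2))              # -> 0.13
{code}

The same conversions reproduce the other columns (e.g. 9176908.992 B/s gives 30.768 GiB/hour), which suggests the reported figures were derived consistently from the measured byte throughput and per-item runtimes.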