Sven Schlarb

Evaluation points

Assessment of measurable points
Metric | Description | Metric baseline | Metric goal | May 13, 2014 (100)
NumberOfObjectsPerHour | Number of objects processed in one hour | 0,5605381166 | | 9,7328863415
MinObjectSizeHandledInGbytes | Smallest ARC file in sample | 0,0001516324 | | 0,0001516324
MaxObjectSizeHandledInGbytes | Biggest ARC file in sample | 0,0010629473 | | 1,2779601114
ThroughputGbytesPerMinute | Data throughput measured in Gbytes per minute | 0,00000495 | | 0,0020730401
ThroughputGbytesPerHour | Data throughput measured in Gbytes per hour | 0,0002967503 | | 0,1243824051
ReliableAndStableAssessment | Manual assessment of whether the experiment ran reliably and stably | | | true
NumberOfFailedFiles | Number of files that failed in the workflow | | | 0
AverageRuntimePerItemInSeconds | Average processing time in seconds per item | 6422,4 | | 369,88

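Given these numbers, the experimental platform available at ONB would not be sufficient to process the complete web archive, which amounted to more than 40 Terabytes at the time of running this experiment. At the measured throughput of about 0,124 Gbytes per hour, 40 Terabytes would require more than 300,000 hours (over three decades) of continuous processing.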
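The scale of the gap can be made concrete with a rough extrapolation. The following is a minimal sketch, not part of the experiment itself: it assumes the measured throughput stays constant, that the platform runs without interruption, and that binary units apply (1 Terabyte = 1024 Gbytes). The function and constant names are illustrative only.

```python
# Measured throughput of the May 13, 2014 (100) run, taken from the table.
THROUGHPUT_GBYTES_PER_HOUR = 0.1243824051
# Lower bound for the archive size stated in the text ("over 40 Terabytes").
ARCHIVE_SIZE_TBYTES = 40
# Assumption: binary units, 1 Terabyte = 1024 Gbytes.
GBYTES_PER_TBYTE = 1024
HOURS_PER_YEAR = 24 * 365


def estimated_hours(size_gbytes, throughput_gbytes_per_hour):
    """Hours needed to process size_gbytes at a constant throughput."""
    return size_gbytes / throughput_gbytes_per_hour


hours = estimated_hours(ARCHIVE_SIZE_TBYTES * GBYTES_PER_TBYTE,
                        THROUGHPUT_GBYTES_PER_HOUR)
years = hours / HOURS_PER_YEAR
print(f"Estimated processing time: {hours:,.0f} hours ({years:.1f} years)")
```

Under these assumptions the estimate is on the order of 330,000 hours, which is well over 30 years of uninterrupted processing.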

Technical details

May 13, 2014 (100): 100 ARC files, processing time 10:16:28 (hh:mm:ss), 1,28 GB (1308,63 MB, 1372199221 bytes)
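The rate metrics reported for this run follow directly from the three raw figures above (file count, processing time, total size). The sketch below shows that derivation. It assumes binary units (1 Gbyte = 1024^3 bytes), which matches the reported 1308,63 MB. The function name and dictionary layout are illustrative and not taken from the evaluation tooling.

```python
# Raw figures reported for the May 13, 2014 (100) run.
NUM_FILES = 100
TOTAL_BYTES = 1372199221
HOURS, MINUTES, SECONDS = 10, 16, 28

# Assumption: binary units, 1 Gbyte = 1024^3 bytes.
BYTES_PER_GBYTE = 1024 ** 3


def derive_metrics(num_files, total_bytes, runtime_seconds):
    """Derive the rate metrics from file count, total size and runtime."""
    total_gbytes = total_bytes / BYTES_PER_GBYTE
    return {
        "NumberOfObjectsPerHour": num_files / (runtime_seconds / 3600),
        "AverageRuntimePerItemInSeconds": runtime_seconds / num_files,
        "ThroughputGbytesPerMinute": total_gbytes / (runtime_seconds / 60),
        "ThroughputGbytesPerHour": total_gbytes / (runtime_seconds / 3600),
    }


runtime_seconds = HOURS * 3600 + MINUTES * 60 + SECONDS
metrics = derive_metrics(NUM_FILES, TOTAL_BYTES, runtime_seconds)
for name, value in metrics.items():
    print(f"{name}: {value:.10f}")
```

The results agree with the values listed in the metrics table for this run, up to rounding.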

These data samples are subsets of the ONB web archive crawl (see ONB Web Archive Dataset).

