

Sven Schlarb <[email protected]>

Evaluation points

Assessment of measurable points
| Metric | Description | Metric baseline | Metric goal | March 04, 2014 (1000) | March 04, 2014 (4924) |
| --- | --- | --- | --- | --- | --- |
| NumberOfObjectsPerHour | Number of objects processed in one hour | 833,8098788 | | 4591,836735 | 4645,283019 |
| MinObjectSizeHandledInGbytes | Smallest ARC file in sample | 0,001638618 | | 0,001638618 | 0,0001516 |
| MaxObjectSizeHandledInGbytes | Biggest ARC file in sample | 0,295765739 | | 0,295765739 | 0,295765739 |
| ThroughputGbytesPerMinute | The throughput of data measured in Gbytes per minute | 1,272703878 | | 7,008850061 | 7,004187217 |
| ThroughputGbytesPerHour | The throughput of data measured in Gbytes per hour | 76,36223269 | | 420,5310036 | 420,251233 (*) |
| ReliableAndStableAssessment | Manual assessment of whether the experiment ran reliably and stably | true | | true | true |
| NumberOfFailedFiles | Number of files that failed in the workflow | 0 | | | |
| AverageRuntimePerItemInSeconds | The average processing time in seconds per item | 4,32 | | 0,78 | 0,774979691 |
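The throughput metrics in the table are simple derivations from three raw job figures: the number of processed items, the total data volume, and the wall-clock runtime. A minimal sketch of that derivation is shown below; the input numbers (total size and runtime) are illustrative assumptions chosen to roughly reproduce the 4924-file run, not values taken from the actual Hadoop job logs.

```python
# Sketch: derive the assessment metrics from raw job figures.
# The concrete inputs below are illustrative assumptions, not values
# read from the job_201401221447_0078 history.

def evaluation_metrics(num_items, total_gbytes, runtime_seconds):
    """Return the throughput metrics used in the assessment table."""
    hours = runtime_seconds / 3600.0
    minutes = runtime_seconds / 60.0
    return {
        "NumberOfObjectsPerHour": num_items / hours,
        "ThroughputGbytesPerMinute": total_gbytes / minutes,
        "ThroughputGbytesPerHour": total_gbytes / hours,
        "AverageRuntimePerItemInSeconds": runtime_seconds / num_items,
    }

# Roughly the 4924-file run: ~3816 s wall clock, ~445.5 GB processed.
metrics = evaluation_metrics(num_items=4924,
                             total_gbytes=445.5,
                             runtime_seconds=3816)
```

With these assumed inputs the function yields about 4645 objects per hour and about 0,775 s per item, consistent with the last column of the table.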

Technical details

The evaluation points in the table above refer to data sets of different sizes and parameter variations; the Hadoop Job-ID links to details about the job execution:

March 04, 2014 (1000): waa-full-arcs-1 (subset 1000), job_201401221447_0079

March 04, 2014 (4924): waa-full-arcs-1 (4924 arc files), job_201401221447_0078

Both data samples are subsets of the ONB web archive crawl (ONB Web Archive Dataset).
