

Sven Schlarb <[email protected]>

Evaluation points

Assessment of measurable points
| Metric | Description | Metric baseline | Metric goal | March 04, 2014 (1000) | March 04, 2014 (4924) |
|--------|-------------|-----------------|-------------|-----------------------|-----------------------|
| NumberOfObjectsPerHour | Number of objects processed in one hour | 833,8098788 | | 2760,736196 | 2813,267735 |
| MinObjectSizeHandledInGbytes | Smallest ARC file in sample (Gbytes) | 0,001638618 | | 0,001638618 | 0,0001516 |
| MaxObjectSizeHandledInGbytes | Biggest ARC file in sample (Gbytes) | 0,295765739 | | 0,295765739 | 0,295765739 |
| ThroughputGbytesPerMinute | Throughput of data measured in Gbytes per minute | 1,272703878 | | 4,213909852 | 4,241862946 |
| ThroughputGbytesPerHour | Throughput of data measured in Gbytes per hour | 76,36223269 | | 252,8345911 | 254,5117767 |
| ReliableAndStableAssessment | Manual assessment of whether the experiment performed reliably and stably | true | | true | true |
| NumberOfFailedFiles | Number of files that failed in the workflow | 0 | | 0 | 0 |
| AverageRuntimePerItemInSeconds | Average processing time in seconds per item | 4,32 | | 1,30 | 1,27965069 |
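The derived metrics above all follow from three raw job measurements: the file count, the total data volume, and the total wall-clock runtime. A minimal sketch of the relationships, using values taken from (or back-derived from) the 4924-file run in the table, is shown below; the raw totals themselves are reconstructed here for illustration and are not reported on this page:

```python
# Raw measurements for the 4924-file run (runtime and data volume are
# back-derived from the reported per-item average and hourly throughput).
num_files = 4924
total_runtime_s = num_files * 1.27965069                  # total wall-clock seconds
total_gbytes = 254.5117767 * (total_runtime_s / 3600.0)   # total data processed

# Derived metrics as they appear in the evaluation table.
objects_per_hour = num_files / (total_runtime_s / 3600.0)
throughput_gb_per_hour = total_gbytes / (total_runtime_s / 3600.0)
throughput_gb_per_minute = total_gbytes / (total_runtime_s / 60.0)
avg_runtime_per_item_s = total_runtime_s / num_files
```

Note that ThroughputGbytesPerMinute is simply ThroughputGbytesPerHour divided by 60, and AverageRuntimePerItemInSeconds is 3600 divided by NumberOfObjectsPerHour.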

Technical details

The evaluation points in the table above refer to data sets of different sizes and parameter variations; the Hadoop Job-ID links to details about the job execution:

March 04, 2014 (1000): waa-full-arcs-1 (subset 1000), job_201401221447_0079

March 04, 2014 (4924): waa-full-arcs-1 (4924 arc files), job_201401221447_0078

Both data samples are subsets of the ONB web archive crawl (ONB Web Archive Dataset).
