
h2. Metrics Catalogue
To unify metrics across all evaluations, all metrics should be registered in this Metrics Catalogue.
h4. Picking metrics
{code}Use CamelCase notation for metric names  e.g. NumberOfObjectsPerHour{code}
When picking metrics for an evaluation, run through the catalogue and pick any already defined, or enter a new metric when needed.
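The pick-or-register rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CATALOGUE` dictionary is a hypothetical in-memory stand-in for this page (only three of its entries are shown), and `pick_metric` and the `CAMEL_CASE` pattern are names invented for the example, not part of any existing tool.

```python
import re

# Hypothetical in-memory stand-in for the catalogue; only a few entries are shown.
CATALOGUE = {
    "NumberOfObjectsPerHour": "integer",
    "IdentificationCorrectnessInPercent": "integer",
    "ReliableAndStableAssessment": "boolean",
}

# Upper CamelCase: one or more capitalised words with no separators.
CAMEL_CASE = re.compile(r"^(?:[A-Z][a-z0-9]*)+$")


def pick_metric(name, datatype=None):
    """Return (name, datatype), reusing an existing entry or registering a new one."""
    if name in CATALOGUE:
        # Already defined: reuse it so evaluations stay comparable.
        return name, CATALOGUE[name]
    if not CAMEL_CASE.match(name):
        raise ValueError(f"metric name is not CamelCase: {name!r}")
    if datatype is None:
        raise ValueError("a new metric needs a datatype")
    CATALOGUE[name] = datatype
    return name, datatype
```

Calling `pick_metric("NumberOfObjectsPerHour")` reuses the existing entry, while `pick_metric("AverageRuntimePerItemInHours", "float")` registers a new one. A name such as `number_of_objects` is rejected because it does not follow the CamelCase convention.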
The attribute/measure catalogue developed in PW can be found here: [Measures|http://ifs.tuwien.ac.at/dp/vocabulary/quality/measures]
An equivalent attribute/measure source can also be found in this Google Doc: [Measures by Google Doc|https://docs.google.com/spreadsheet/ccc?key=0An_F2fZCFRRtdGZ6NFg0eFI3b3NIdktMSzBtWmhKUHc&pli=1#gid=0] (write to Kresimir Duretec for access to the Google Doc).
|| Metric || Datatype || Description || Example || Comments ||
| NumberOfObjectsPerHour | integer | Number of objects that can be processed per hour | 250 | Could be used both for component evaluations on a single machine and on entire platform setups |
| IdentificationCorrectnessInPercent | integer | Statistical measure for binary evaluations, [see detailed specification below|#Metricscataloguefmeasure] | 85 % | Between 0 and 100 |
| MaxObjectSizeHandledInGbytes | integer | The max file size a workflow/component has handled | 80 | Specify in Gbytes |
| PlanEfficiencyInHours | integer | Number of hours it takes to build one preservation plan with Plato | 20 | Specify in hours |
| ThroughputGbytesPerMinute | integer | The throughput of data measured in Gbytes per minute | 5 | Specify in Gbytes per minute |
| ReliableAndStableAssessment | boolean | Manual assessment of whether the experiment performed reliably and stably | true | |
| NumberOfFailedFiles | integer | Number of files that failed in the workflow | 0 | |
h4. Metrics in use as of first round of evaluations
An attribute/measure catalogue is also being developed in PW. This evaluation metrics catalogue will be merged with the PW catalogue in year 3.
|| Metric || Previously known as || URL ||
| [number of objects per second|http://purl.org/DP/quality/measures#418] | NumberOfObjectsPerHour | http://purl.org/DP/quality/measures#418 |
| [IdentificationCorrectnessInPercent|http://purl.org/DP/quality/measures#417] | IdentificationCorrectnessInPercent | http://purl.org/DP/quality/measures#417 |
| [max object size handled in bytes|http://purl.org/DP/quality/measures#404] | MaxObjectSizeHandledInGbytes | http://purl.org/DP/quality/measures#404 |
| [min object size handled in bytes|http://purl.org/DP/quality/measures#405] | MinObjectSizeHandledInMbytes | http://purl.org/DP/quality/measures#405 |
| [N/A|https://github.com/openplanets/policies/issues/6] | PlanEfficiencyInHours | see https://github.com/openplanets/policies/issues/6 |
| [throughput in bytes per second|http://purl.org/DP/quality/measures#406] | ThroughputGbytesPerMinute | http://purl.org/DP/quality/measures#406 |
| [throughput in bytes per second|http://purl.org/DP/quality/measures#406] | ThroughputGbytesPerHour | http://purl.org/DP/quality/measures#406 |
| [stability judgement|http://purl.org/DP/quality/measures#108] | ReliableAndStableAssessment | http://purl.org/DP/quality/measures#108 |
| [failed objects in percent|http://purl.org/DP/quality/measures#407] | NumberOfFailedFiles | http://purl.org/DP/quality/measures#407 |
| [N/A|https://github.com/openplanets/policies/issues/11] | NumberOfFailedFilesAcceptable | see https://github.com/openplanets/policies/issues/11 |
| [QAFalseDifferentPercent|http://purl.org/DP/quality/measures#416] | QAFalseDifferentPercent | http://purl.org/DP/quality/measures#416 |
| [N/A|https://github.com/openplanets/policies/issues/13] | AverageRuntimePerItemInHours | see https://github.com/openplanets/policies/issues/13 |
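Several of the mappings above change the unit as well as the name, for example from objects per hour to objects per second, or from Gbytes per minute to bytes per second. The following Python sketch shows the arithmetic involved. It rests on assumptions that are not stated on this page: decimal units are used (1 Gbyte = 10^9 bytes, 1 Mbyte = 10^6 bytes), all function names are hypothetical, and the `total_files` argument is an extra input that the old NumberOfFailedFiles metric does not record.

```python
# Assumption: decimal units. The page does not say whether binary units
# (2**30 and 2**20 bytes) were intended instead.
BYTES_PER_GBYTE = 10**9
BYTES_PER_MBYTE = 10**6


def objects_per_hour_to_per_second(objects_per_hour):
    """NumberOfObjectsPerHour -> number of objects per second."""
    return objects_per_hour / 3600


def gbytes_per_minute_to_bytes_per_second(gbytes_per_minute):
    """ThroughputGbytesPerMinute -> throughput in bytes per second."""
    return gbytes_per_minute * BYTES_PER_GBYTE / 60


def gbytes_per_hour_to_bytes_per_second(gbytes_per_hour):
    """ThroughputGbytesPerHour -> throughput in bytes per second."""
    return gbytes_per_hour * BYTES_PER_GBYTE / 3600


def gbytes_to_bytes(gbytes):
    """MaxObjectSizeHandledInGbytes -> max object size handled in bytes."""
    return gbytes * BYTES_PER_GBYTE


def mbytes_to_bytes(mbytes):
    """MinObjectSizeHandledInMbytes -> min object size handled in bytes."""
    return mbytes * BYTES_PER_MBYTE


def failed_files_to_percent(failed_files, total_files):
    """NumberOfFailedFiles -> failed objects in percent (needs the total count)."""
    if total_files <= 0:
        raise ValueError("total_files must be positive")
    return 100 * failed_files / total_files
```

For instance, `objects_per_hour_to_per_second(250)` converts the example value from the old catalogue, and `gbytes_to_bytes(80)` does the same for the maximum object size.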
If you want to have a quick glance at the PW catalogue, it is located here (Google Docs): [https://docs.google.com/spreadsheet/ccc?key=0An_F2fZCFRRtdGZ6NFg0eFI3b3NIdktMSzBtWmhKUHc&pli=1#gid=0]
Write to Christoph Becker to ask for access to the Google Doc.
If you are already familiar with the PW catalogue, you are of course most welcome to use existing metrics from it; this will make the merging in year 3 much easier. However, this is currently NOT a requirement.
{anchor:fmeasure}
...
This is one suggested approach. It applies well when we test for binary correctness of calculations, i.e. it is applicable for characterisation and QA.
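As an illustration only, the sketch below computes a correctness score for a binary evaluation. It assumes that the specification referred to by the `fmeasure` anchor is the standard F-measure (the weighted harmonic mean of precision and recall); the specification itself is not reproduced here, so treat both the formula choice and the function name `f_measure` as assumptions.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """Standard F-beta score from binary evaluation counts (beta=1 gives F1)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        # Precision or recall is undefined; report 0 by convention.
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def correctness_in_percent(true_positives, false_positives, false_negatives):
    """Scale the F1 score to an integer between 0 and 100."""
    return round(100 * f_measure(true_positives, false_positives, false_negatives))
```

With 8 correct identifications, 2 false positives and 2 false negatives, precision and recall are both 0.8, so `correctness_in_percent(8, 2, 2)` yields 80, which fits the integer range of IdentificationCorrectnessInPercent.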
h2. History
h4. Previously used evaluation metrics
{code}Use CamelCase notation for metric names  e.g. NumberOfObjectsPerHour{code}
|| Metric || PW catalogue URI || Datatype || Description || Example || Comments ||
| NumberOfObjectsPerHour | | integer | Number of objects that can be processed per hour | 250 | Could be used both for component evaluations on a single machine and on entire platform setups |
| IdentificationCorrectnessInPercent | | integer | Statistical measure for binary evaluations, [see detailed specification below|#Metricscataloguefmeasure] | 85 % | Between 0 and 100 |
| MaxObjectSizeHandledInGbytes | | integer | The max file size a workflow/component has handled | 80 | Specify in Gbytes |
| MinObjectSizeHandledInMbytes | | integer | The min file size a workflow/component has handled; illustrates capability of running on heterogeneous file sizes when combined with MaxObjectSizeHandledInGbytes | 20 | Specify in Mbytes |
| PlanEfficiencyInHours | | integer | Number of hours it takes to build one preservation plan with Plato | 20 | Specify in hours |
| ThroughputGbytesPerMinute | | integer | The throughput of data measured in Gbytes per minute | 5 | Specify in Gbytes per minute |
| ThroughputGbytesPerHour | | integer | The throughput of data measured in Gbytes per hour | 25 | Specify in Gbytes per hour |
| ReliableAndStableAssessment | | boolean | Manual assessment of whether the experiment performed reliably and stably | true | |
| NumberOfFailedFiles | | integer | Number of files that failed in the workflow | 0 | |
| NumberOfFailedFilesAcceptable | | boolean | Manual assessment of whether the number of files that fail in the workflow is acceptable | true | |
| QAFalseDifferentPercent | | integer | Percentage of content comparisons resulting in _original and migrated different_, even though human spot checking says _original and migrated similar_ | 5 % | Between 0 and 100 |
| AverageRuntimePerItemInHours | | float | The average processing time in hours per item | 15 | Positive floating point number |