h2. Metrics Catalogue

To unify metrics across all evaluations, every metric should be registered in this Metrics Catalogue. When picking metrics for an evaluation, go through the catalogue and reuse any metric that is already defined; add a new metric only when none of the existing ones fits.

{code}Use CamelCase notation for metric names - e.g. NumberOfObjectsPerHour{code}
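The naming rule above could be checked automatically before a new metric is entered. The following is only a minimal sketch: the function name {{is_camel_case}} is hypothetical, and the pattern is a loose approximation (upper-case first letter, letters and digits only) rather than an agreed definition of CamelCase.

```python
import re

# Assumption: "CamelCase" is approximated as an upper-case first letter
# followed only by ASCII letters and digits (no spaces, underscores, hyphens).
_CAMEL_CASE = re.compile(r"[A-Z][A-Za-z0-9]*")


def is_camel_case(name: str) -> bool:
    """Return True if the metric name matches the approximate CamelCase rule."""
    return _CAMEL_CASE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("NumberOfObjectsPerHour", "number_of_objects_per_hour"):
        print(candidate, is_camel_case(candidate))
```

A stricter check (for example, rejecting names that are entirely upper case) may be preferable if the catalogue grows.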

|| Metric || Datatype || Description || Example || Comments ||
| NumberOfObjectsPerHour | integer | Number of objects that can be processed per hour | 250 | Can be used both for component evaluations on a single machine and for entire platform setups |
| IdentificationCorrectnessInPercent | integer | Statistical measure for binary evaluations ([see detailed specification below|#Metricscatalogue-fmeasure]) | 85 % | Between 0 and 100 |
| MaxObjectSizeHandledInGbytes | integer | Maximum file size a workflow/component has handled | 80 | Specify in Gbytes |
| PlanEfficiencyInHours | integer | Number of hours it takes to build one preservation plan with Plato | 20 | Specify in hours |
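To illustrate how an evaluation might use the catalogue, the sketch below holds the four metrics above in a small lookup table and checks a measured value against it. The names {{Metric}}, {{CATALOGUE}} and {{check_measurement}} are hypothetical and not part of any existing tool; the only range enforced is the 0 to 100 limit stated for IdentificationCorrectnessInPercent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metric:
    """One catalogue row (hypothetical structure, for illustration only)."""
    name: str
    description: str
    minimum: Optional[int] = None  # lower bound, if the catalogue states one
    maximum: Optional[int] = None  # upper bound, if the catalogue states one


CATALOGUE = {
    m.name: m
    for m in (
        Metric("NumberOfObjectsPerHour",
               "Number of objects that can be processed per hour"),
        Metric("IdentificationCorrectnessInPercent",
               "Statistical measure for binary evaluations", 0, 100),
        Metric("MaxObjectSizeHandledInGbytes",
               "Maximum file size a workflow/component has handled"),
        Metric("PlanEfficiencyInHours",
               "Hours needed to build one preservation plan with Plato"),
    )
}


def check_measurement(name: str, value: int) -> int:
    """Return value if it fits the registered metric, otherwise raise."""
    if name not in CATALOGUE:
        raise KeyError(f"metric not registered in the catalogue: {name}")
    # All metrics listed above use the integer datatype; bool is excluded
    # explicitly because it is a subclass of int in Python.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{name} expects an integer value")
    metric = CATALOGUE[name]
    if metric.minimum is not None and value < metric.minimum:
        raise ValueError(f"{name} must be at least {metric.minimum}")
    if metric.maximum is not None and value > metric.maximum:
        raise ValueError(f"{name} must be at most {metric.maximum}")
    return value


if __name__ == "__main__":
    print(check_measurement("NumberOfObjectsPerHour", 250))
```

Rejecting unregistered names at this point is one way to make sure evaluations only report metrics that are defined in the catalogue.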

{anchor:fmeasure}

h2. Binary evaluation method (FMeasure)

We use _sensitivity_ and _specificity_ as statistical measures of the performance of the binary classification test, where

_Sensitivity_ = Σ {color:#99cc00}true different{color} / (Σ {color:#99cc00}true different{color} \+ Σ {color:#ff0000}false similar{color})

and

_Specificity_ = Σ {color:#99cc00}true similar{color} / (Σ {color:#99cc00}true similar{color} \+ Σ {color:#ff0000}false different{color})

and the F-measure is calculated on this basis as shown in the table below:

!BinaryEvaluation.png|border=1,width=551,height=201!

This is one suggested approach. It works well when testing the binary correctness of calculations, i.e. it is applicable to characterisation and QA.
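The two ratios defined above can be computed directly from the four outcome counts. This is a minimal sketch: the class and function names are hypothetical, the counts in the usage example are made up for illustration, and the F-measure itself is left out because its exact calculation is defined in the table image rather than in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    """Summed outcomes of a binary (similar/different) evaluation."""
    true_different: int
    false_similar: int
    true_similar: int
    false_different: int


def sensitivity(c: BinaryCounts) -> float:
    """true different / (true different + false similar)."""
    denominator = c.true_different + c.false_similar
    if denominator == 0:
        raise ValueError("sensitivity is undefined without 'different' cases")
    return c.true_different / denominator


def specificity(c: BinaryCounts) -> float:
    """true similar / (true similar + false different)."""
    denominator = c.true_similar + c.false_different
    if denominator == 0:
        raise ValueError("specificity is undefined without 'similar' cases")
    return c.true_similar / denominator


if __name__ == "__main__":
    # Illustrative counts only, not measured results.
    counts = BinaryCounts(true_different=40, false_similar=10,
                          true_similar=45, false_different=5)
    print(f"sensitivity = {sensitivity(counts):.2f}")
    print(f"specificity = {specificity(counts):.2f}")
```

Raising an error for an empty denominator, instead of returning 0, keeps an evaluation with no relevant cases from being mistaken for a poor result.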

