This document covers milestone MS77 and marks its completion.
Several factors affect the current approach to collecting evaluation data from testbeds:
- The project has not yet evolved far enough to apply a fully automated approach. This means that the draft evaluation methodology will largely depend on manually collected results (specific metrics coming out of experiments and workflows).
- The central datastore (currently called REF), which will later in the project collect and hold all kinds of evaluation data, is not yet fully developed and ready. The development of this datastore is now formally defined as a task in PC.WP1. When the generic datastore is ready for use, developers of workflows, components and the workflow engine Taverna itself will have to start using it. This is not realistic within the timeframe for the year-2 evaluations, since the evaluation report D18.1 must be finished in M24, which means that evaluations will have to start in September 2012.
In practice, this means that evaluation data will, for now and for the year-2 evaluations, be collected manually by evaluators. Since we do not plan for a large number of evaluation points with many different metrics, this should work well for now.
The evaluations carried out in year 2 will also give us valuable knowledge about which metrics and measures are relevant for evaluating SCAPE results, and will thus serve as input for defining requirements for developers of components, workflows and Taverna itself.
An automated approach will therefore be defined and set up as part of the already planned revision of the evaluation methodology, which, according to the DoW, will be delivered in M33 (MS78, Refined SCAPE evaluation methodology).
The revision of the evaluation methodology will also include the integration of evaluation with REF. TB.WP4 will provide requirements for how results should be published in REF so that they can be used for systematic evaluation.