h1. {color:#ff0000}This page and its children are old pages to be removed at some point{color}
Link to the TB.WP4 SharePoint site: [https://portal.ait.ac.at/sites/Scape/TB/TB.WP.4/default.aspx|https://portal.ait.ac.at/sites/Scape/TB/TB.WP.4/default.aspx]
h1. Evaluation areas
h2. Level 1: Preservation system
|| Goal \\ || Objective \\
(what to obtain) || Metrics \\ || Current result || How assessed || Who assesses \\ ||
| *Data preparation* (how easy is it to prepare data, and how long does it take) | | How much time? | | Timing and comments | Technical staff who run the experiment |
| *Effectiveness of preservation* (whether the system as a whole is capable of effectively preserving the digital assets against the issues identified in the scenarios) | | Storytelling | | A subjective judgement of whether the issue is solved | Repository owners in cooperation with issue owners |
| *Performance, scalability* \\ | | * Speed (overall run time, objects per second)
* CPU usage
* RAM usage
* Network usage
* Disk I/O usage
* Changes in these with number/complexity of digital objects | | Running tests against sample datasets and measuring performance in aspects of interest | Automated. Should be output of components/workflows and/or Taverna |
| *System implementation effectiveness* (based on the concrete hardware, software and infrastructure setup. Proactive monitoring should be done to identify hidden bottlenecks at strategic measurement points. For example, an HDFS implementation has different critical parameters than a NAS implementation, and an expert is needed to identify which parameters to monitor proactively, e.g. current disk queue length, average disk queue length, disk idle time, network packet receive errors, connections established, page reads, page writes, etc.) | | | | Implementation of system monitoring at strategic measurement points, depending on the infrastructure implementation | System engineer for the particular system |
| *User experience* \\ | | * Ease of interaction for manual tasks
* Appropriateness of outputs for human inspection
* Ease of system management
* Ease of movement from workflow design to deployment on a cluster | | Structured questionnaire | Users of the preservation system (managers, data acquisition staff, …) |
| *Organisational fit* (whether the system as a whole integrates well into the operations of the organisation that runs the archive/repository) | | | | Structured questionnaire | Repository managers |
| *Industrial/commercial readiness* \\ | | * Robustness of software
* Ease of installation/ configuration/ customisation
* ... \\ | | Personal judgement supported by evidence | Suitably qualified project partners |
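The performance and scalability row above calls for automated measurements (overall run time, objects per second) against sample datasets. A minimal sketch of such a measurement harness is shown below; `process_one` is a hypothetical stand-in for a real migration or characterisation step, not a SCAPE or Taverna API.

```python
# Minimal sketch of an automated throughput measurement, assuming a
# per-object processing callable; the workload here is a placeholder.
import time

def measure_throughput(process_one, objects):
    """Time a per-object processing function and report overall metrics."""
    start = time.perf_counter()
    for obj in objects:
        process_one(obj)
    elapsed = time.perf_counter() - start
    return {
        "objects": len(objects),
        "seconds": elapsed,
        "objects_per_second": len(objects) / elapsed if elapsed > 0 else 0.0,
    }

if __name__ == "__main__":
    # Stand-in workload for a real preservation action on 50 sample objects.
    result = measure_throughput(lambda o: sum(range(1000)), list(range(50)))
    print(result["objects"], "objects,", round(result["objects_per_second"]), "obj/s")
```

CPU, RAM, network and disk I/O usage would be collected alongside this by the platform's own monitoring rather than by the timing loop itself.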
h2. Level 2: Preservation tools
_Comment: This level may need to cover not only individual tools but whole workflows._
|| Goal \\ || Aspects \\ || How assessed \\ || Who assesses ||
| *Correctness of performance* (the extent to which the tool does the job it is supposed to do, i.e. verification) | | Runs over sample datasets, with some extreme cases | Tool developers |
| *Robustness, failure rates* | | Runs over sample datasets, with some extreme cases | Tool developers |
| *Performance of tool* (distinct from the performance of the preservation system as a whole. A format converter that took 10 minutes per file would probably be of little practical use\!) | | Running tests against sample data objects and measuring performance | Tool developers |
| *Ease of integration* (how easily the tool can be integrated into the SCAPE platform) | | Personal judgement supported by evidence | Testbed partners in conjunction with tool developers |
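The robustness and failure-rate row above amounts to running a tool over sample datasets, including extreme cases, and tallying failures. A hedged sketch of that tally follows; `tool` is a placeholder callable, not a real SCAPE interface.

```python
# Hedged sketch of a robustness check: apply a tool to each sample object
# (including deliberately extreme cases) and return the fraction that failed.
def failure_rate(tool, samples):
    """Apply `tool` to each sample and return the fraction that raised."""
    failures = 0
    for sample in samples:
        try:
            tool(sample)
        except Exception:
            failures += 1
    return failures / len(samples) if samples else 0.0

if __name__ == "__main__":
    def fragile(x):
        # Hypothetical tool that cannot handle "extreme" (negative) inputs.
        if x < 0:
            raise ValueError("extreme case not handled")
    print(failure_rate(fragile, [1, -1, 2, -2]))  # 0.5
```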
h1. Scenarios <--> Evaluation Areas Matrix
|| Scenario \\ || Speed || Technical Measures || Scalability || Robustness || Manual Assessment || SCAPE Technical Evaluation || Integration Evaluation ||
| [LSDRT1 Unknown preservation risks in large media files|http://wiki.opf-labs.org/display/SP/LSDRT1+Unknown+preservation+risks+in+large+media+files]\\ | | *X* | *X* | *X* | *X* | | *X* |
| [LSDRT2 Digitised TIFFs do not meet storage and access requirements|http://wiki.opf-labs.org/display/SP/LSDRT2+Digitised+TIFFs+do+not+meet+storage+and+access+requirements]\\ | *X* | | *X* | | *X* | *X* | *X* |
| [LSDRT3 Do acquired files conform to an agreed technical profile, are they valid and are they complete?|http://wiki.opf-labs.org/pages/viewpage.action?pageId=6979761]\\ | *X* | | *X* | | | | |
| [LSDRT4 Out-of-sync Sound and Video in wmv to Video Format-X Migration Results|http://wiki.opf-labs.org/display/SP/LSDRT4+Out-of-sync+Sound+and+Video+in+wmv+to+Video+Format-X+Migration+Results]\\ | | *X* | *X* | | | *X* | |
| [LSDRT5 Detect audio files with very bad sound quality|http://wiki.opf-labs.org/display/SP/LSDRT5+Detect+audio+files+with+very+bad+sound+quality]\\ | | *X* | *X* | *X* | | *X* | |
| [LSDRT6 Migrate mp3 to wav|http://wiki.opf-labs.org/display/SP/LSDRT6+Migrate+mp3+to+wav]\\ | *X* | | *X* | | *X* | | |
| [LSDRT7 Characterise and Validate very large video files|http://wiki.opf-labs.org/display/SP/LSDRT7+Characterise+and+Validate+very+large+video+files]\\ | *X* | *X* | *X* | *X* | | | *X* |
| [LSDRT9 Characterisation of large amounts of wav audio|http://wiki.opf-labs.org/display/SP/LSDRT9+Characterisation+or+large+amounts+of+wav+audio]\\ | *X* | *X* | *X* | *X* | | | *X* |
| [LSDRT10 Camera raw files pose preservation risk|http://wiki.opf-labs.org/display/SP/LSDRT10+Camera+raw+files+pose+preservation+risk]\\ | | *X* | | | *X* | *X* | |
| [RDST1 Scientific Data Ingest-related Scenario|http://wiki.opf-labs.org/display/SP/RDST1+Scientific+Data+Ingest-related+Scenario]\\ | *X* | | *X* | *X* | | *X* | |
| [RDST2 Format Migration of (raw) Scientific Datasets|http://wiki.opf-labs.org/display/SP/RDST2+Format+Migration+of+%28raw%29+Scientific+Datasets]\\ | | *X* | *X* | | | *X* | *X* |
| [RDST3 Maintaining understandability and usability of raw data through external resources|http://wiki.opf-labs.org/display/SP/RDST3+Maintaining+understandability+and+usability+of+raw+data+through+external+resources] | | | | | | | |
| [RDST4 Preserving the value of raw data and verifiability of processed datasets forming part of a scientific workflow|http://wiki.opf-labs.org/display/SP/RDST4+Preserving+the+value+of+raw+data+and+verifiability+of+processed+datasets+forming+part+of+a+scientific+workflow] | | *X* | | | | | |
| [S1 Image based document comparison approach|http://wiki.opf-labs.org/display/SP/S1+Image+based+document+comparison+approach] | | *X* | | | | | |
| [WCT1a Quantitative comparison of a webharvest and reference harvest|http://wiki.opf-labs.org/display/SP/WCT1a+Quantitative+comparison+of+a+webharvest+and+reference+harvest] | | *X* | | | | | |
| [WCT1b Visual comparison of a webharvest and reference harvest|http://wiki.opf-labs.org/display/SP/WCT1b+Visual+comparison+of+a+webharvest+and+reference+harvest] | | *X* | | | | | |
| [WCT2 ARC to WARC migration|http://wiki.opf-labs.org/display/SP/WCT2+ARC+to+WARC+migration] | | *X* | *X* | *X* | *X* | *X* | *X* |
| [WCT3 Characterise web content in ARC and WARC containers at State and University Library Denmark|http://wiki.opf-labs.org/display/SP/WCT3+Characterise+web+content+in+ARC+and+WARC+containers+at+State+and+University+Library+Denmark] | | *X* | *X* | *X* | | *X* | |
| [WCT4 Web Archive Mime-Type detection at Austrian National Library|http://wiki.opf-labs.org/display/SP/WCT4+Web+Archive+Mime-Type+detection+at+Austrian+National+Library] | | | *X* | *X* | | | |
| [WCT5 Pattern based recognition of conspicuous images|http://wiki.opf-labs.org/display/SP/WCT5+Pattern+based+recognition+of+conspicuous+images] | | *X* | | | | *X* | |
| [WCT6 (W)ARC to HBase migration|http://wiki.opf-labs.org/display/SP/WCT6+%28W%29ARC+to+HBase+migration] | | | *X* | *X* | *X* | *X* | *X* |
| [WCT7 Format obsolescence detection|http://wiki.opf-labs.org/display/SP/WCT7+Format+obsolescence+detection] | | | *X* | *X* | | | |