
This page and its children are old pages to be removed at some point.

Link to the TB.WP4 SharePoint site:

Evaluation areas

Level 1: Preservation system

For each evaluation area (what to obtain), the table lists the current result, how it is assessed, and who assesses it.

Data preparation (how easy is it to prepare data, and how long does it take?)
  Current result: How much time?
  How assessed: Timing and comments
  Who assesses: Technical staff that run the experiment

Effectiveness of preservation (whether the system as a whole is capable of effectively preserving the digital assets against the issues identified in the scenarios)
  Current result: Storytelling
  How assessed: A subjective judgement of whether the issue is solved
  Who assesses: Repository owners in cooperation with issue owners

Performance, scalability
  • Speed (overall run time, objects per second)
  • CPU usage
  • RAM usage
  • Network usage
  • Disk I/O usage
  • Changes in these with the number/complexity of digital objects
  How assessed: Running tests against sample datasets and measuring performance in the aspects of interest
  Who assesses: Automated; should be output of components/workflows and/or Taverna

System implementation effectiveness (based on the concrete hardware, software and infrastructure setup; proactive monitoring should be done to identify hidden bottlenecks at strategic measurement points. For example, an HDFS implementation has different critical parameters than a NAS implementation and needs an expert to identify which parameters to monitor proactively, e.g. current disk queue length, average disk queue length, disk idle time, network packet receive errors, connections established, page reads, page writes, etc.)
  How assessed: Implementation of system monitoring at strategic measurement points, depending on the infrastructure implementation
  Who assesses: System engineer for the particular system

User experience
  • Ease of interaction for manual tasks
  • Appropriateness of outputs for human inspection
  • Ease of system management
  • Ease of moving from workflow design to deployment on a cluster
  How assessed: Structured questionnaire
  Who assesses: Users of the preservation system (managers, data acquisition staff, …)

Organisational fit (whether the system as a whole integrates well into the operations of the organisation that runs the archive/repository)
  How assessed: Structured questionnaire
  Who assesses: Repository managers

Industrial/commercial readiness
  • Robustness of software
  • Ease of installation/configuration/customisation
  • ...
  How assessed: Personal judgement supported by evidence
  Who assesses: Suitably qualified project partners
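The automated performance measurements above (overall run time, objects per second, peak RAM) can be collected by a small harness wrapped around the component under test. A minimal sketch in Python, assuming the tool is invoked once per digital object; the `migrate` function here is a hypothetical stand-in, not a SCAPE component:

```python
import time
import tracemalloc

def migrate(obj):
    # Hypothetical stand-in for the preservation tool under test
    # (e.g. a format migration or characterisation step).
    return obj.upper()

def benchmark(tool, objects):
    """Run `tool` over `objects`; report run time, throughput, peak RAM."""
    tracemalloc.start()
    start = time.perf_counter()
    for obj in objects:
        tool(obj)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "objects": len(objects),
        "seconds": round(elapsed, 3),
        "objects_per_second": len(objects) / elapsed if elapsed else float("inf"),
        "peak_ram_bytes": peak,
    }

sample = ["file-%d" % i for i in range(1000)]
print(benchmark(benchmark and migrate, sample))
```

CPU, network and disk I/O usage would come from outside the process (e.g. the platform's own monitoring), since they depend on the concrete infrastructure rather than the tool alone.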

Level 2: Preservation tools

Comment: This level may need to cover not only individual tools but whole workflows.

For each evaluation area, the table lists how it is assessed and who assesses it.

Correctness of performance (the extent to which the tool does the job it is supposed to do; verification)
  How assessed: Runs over sample datasets, with some extreme cases
  Who assesses: Tool developers

Robustness, failure rates
  How assessed: Runs over sample datasets, with some extreme cases
  Who assesses: Tool developers

Performance of tool (distinct from the performance of the preservation system as a whole; a format converter that took 10 minutes per file would probably be of little practical use!)
  How assessed: Running tests against sample data objects and measuring performance
  Who assesses: Tool developers

Ease of integration (how easily the tool can be integrated into the SCAPE platform)
  How assessed: Personal judgement supported by evidence
  Who assesses: Testbed partners in conjunction with tool developers
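The correctness and robustness runs over sample datasets with extreme cases can be automated as plain assertion suites. A hedged sketch, assuming a hypothetical reversible `convert` tool so a round-trip check is possible; real migrations would instead compare significant properties of input and output:

```python
def convert(data: bytes) -> bytes:
    # Hypothetical stand-in for the migration tool under test; chosen
    # to be reversible so the round-trip verification below can run.
    return data[::-1]

def convert_back(data: bytes) -> bytes:
    # Inverse of the stand-in conversion.
    return data[::-1]

# Sample dataset plus extreme cases: an ordinary object, an empty
# input, a single NUL byte, and a comparatively large object.
samples = [b"ordinary payload", b"", b"\x00", b"x" * 10_000_000]

for sample in samples:
    migrated = convert(sample)
    assert convert_back(migrated) == sample, "round-trip failed"
print("all %d samples verified" % len(samples))
```

Failure rates fall out of the same harness by counting how many samples raise exceptions or fail their checks instead of aborting on the first one.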


Scenarios <--> Evaluation Areas Matrix

The evaluation areas are: Speed, Technical Measures, Scalability, Robustness, Manual Assessment, SCAPE Technical Evaluation, Integration Evaluation. Each scenario below lists the areas that apply to it.

LSDRT1 Unknown preservation risks in large media files: Technical Measures, Scalability, Robustness, Manual Assessment, Integration Evaluation
LSDRT2 Digitised TIFFs do not meet storage and access requirements: Speed, Scalability, Manual Assessment, SCAPE Technical Evaluation, Integration Evaluation
LSDRT3 Do acquired files conform to an agreed technical profile, are they valid and are they complete?: Speed, Scalability
LSDRT4 Out-of-sync Sound and Video in wmv to Video Format-X Migration Results: Technical Measures, Scalability, SCAPE Technical Evaluation
LSDRT5 Detect audio files with very bad sound quality: Technical Measures, Scalability, Robustness, SCAPE Technical Evaluation
LSDRT6 Migrate mp3 to wav: Speed, Scalability, Manual Assessment
LSDRT7 Characterise and Validate very large video files: Speed, Technical Measures, Scalability, Robustness, Integration Evaluation
LSDRT9 Characterisation of large amounts of wav audio: Speed, Technical Measures, Scalability, Robustness, Integration Evaluation
LSDRT10 Camera raw files pose preservation risk: Technical Measures, Manual Assessment, SCAPE Technical Evaluation
RDST1 Scientific Data Ingest-related Scenario: Speed, Scalability, Robustness, SCAPE Technical Evaluation
RDST2 Format Migration of (raw) Scientific Datasets: Technical Measures, Scalability, SCAPE Technical Evaluation, Integration Evaluation
RDST3 Maintaining understandability and usability of raw data through external resources: (none assigned)
RDST4 Preserving the value of raw data and verifiability of processed datasets forming part of a scientific workflow: Technical Measures
S1 Image based document comparison approach: Technical Measures
WCT1a Quantitative comparison of a web harvest and reference harvest: Technical Measures
WCT1b Visual comparison of a web harvest and reference harvest: Technical Measures
WCT2 ARC to WARC migration: Technical Measures, Scalability, Robustness, Manual Assessment, SCAPE Technical Evaluation, Integration Evaluation
WCT3 Characterise web content in ARC and WARC containers at State and University Library Denmark: Technical Measures, Scalability, Robustness, SCAPE Technical Evaluation
WCT4 Web Archive MIME-type detection at Austrian National Library: Scalability, Robustness
WCT5 Pattern based recognition of conspicuous images: Technical Measures, SCAPE Technical Evaluation
WCT6 (W)ARC to HBase migration: Scalability, Robustness, Manual Assessment, SCAPE Technical Evaluation, Integration Evaluation
WCT7 Format obsolescence detection: Scalability, Robustness