The current list of scenarios on the SCAPE Wiki is a valuable resource and presents many useful problems and ideas for solutions that SCAPE might address. However, it is sometimes difficult to extract the information a developer needs to construct an appropriate solution, or to ensure that the solution can be successfully evaluated. To this end, we would like to refactor the scenarios into a simpler format that captures requirements succinctly and provides a starting point for development, as well as a defined set of criteria for evaluation using the existing evaluation framework. We are also trying to ensure that this refactoring does not affect work already completed that either references or makes use of the existing scenarios. Finally, we do not propose removing the current scenario pages, as they are a useful resource, but suggest deprecating them.
On a practical note, I would like to stop using Confluence "includes" - while a nice idea, they are difficult to work with. Where a scenario is related to something else in the wiki - experiments, evaluation results, functional requirements - simply link to it.
This page describes the refactored scenario proposal. Your comments are very welcome!
We're suggesting that we refactor the existing Scenario triples into a simpler format that captures what went before, but makes it easier to identify functional requirements and perform evaluations. We're suggesting we have scenarios, which present a top-level view of the problem and capture user requirements; and experiments, which combine a dataset, a solution (typically the SCAPE Platform, components and the workflow) and any evaluation criteria, which are usually closely related to the dataset and the dataset owner's needs. Scenarios may have zero or more experiments associated with them.
A Scenario represents a top-level summary of the problem written as a user story (an 'epic') and some user requirements. This story should not reference any given organisation, but rather try to generalise the problem such that someone outside of SCAPE could read through the scenarios and see which, if any, could be useful to them. For example, rather than saying "the BL's Web Archive Dataset", it is better to say "A large collection of WARC files". This will allow us to use the scenarios as a showcase for the kinds of problems SCAPE may solve. The user requirements should identify what needs to be done rather than the tools to do it. Development of functional requirements and identification of tools - existing or required - should be done outside of the scenarios and is not part of the test bed scenario development work.
Scenarios should be short and focused. [we're essentially removing the "business needs" discursive parts of the existing scenarios - is that a problem?]
We have attempted a refactoring of scenario WCT1 as an example - [WCT1 Comparison of Web Archive Pages - Refactored Scenario Example]
Each scenario will then have one or more experiments associated with it. An experiment is a real-life application of a scenario, outlining an existing dataset, the business needs of the dataset owner, a workflow (typically Taverna) and the set of evaluation criteria. We are viewing the evaluation criteria as dataset-specific requirements - for example, throughput may not be a big issue when retrospectively migrating an existing collection, but throughput may become paramount when QAing content as it arrives from a digitisation agency where problems need to be identified before the contract ends.
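As a rough illustration of how the two page types relate, the structure described above could be modelled as below. This is only a sketch: the class and field names, and the example requirement and criteria values, are our own assumptions and not part of any existing SCAPE schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Experiment:
    """A real-life application of a scenario (field names are illustrative)."""
    dataset: str
    business_needs: str
    workflow: str
    # Dataset-specific requirements, e.g. throughput.
    evaluation_criteria: List[str] = field(default_factory=list)


@dataclass
class Scenario:
    """A generalised user story plus user requirements, with linked experiments."""
    user_story: str
    user_requirements: List[str] = field(default_factory=list)
    # Experiments are linked to the scenario rather than embedded in its text.
    experiments: List[Experiment] = field(default_factory=list)


# Hypothetical example values, written without naming any organisation.
scenario = Scenario(
    user_story="As a collection owner, I want to migrate a large collection "
               "of TIFF files to JP2.",
    user_requirements=["Migrated files can be checked against the originals"],
)
scenario.experiments.append(
    Experiment(
        dataset="A large collection of TIFF files",
        business_needs="Retrospective migration of an existing collection",
        workflow="Taverna starting Hadoop jobs",
        evaluation_criteria=["throughput"],
    )
)
```

Note that the scenario itself carries no tools or organisation names; those details live only in the experiment.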
A new experiment should be created if using a new dataset or a different workflow. Provided just one of these changes and the evaluation criteria remain the same, it should be possible to compare evaluations. For example, a single scenario (such as TIFF to JP2 migration) may have multiple experiments with different workflows (process files sequentially, process using Hadoop-only, process using Taverna starting Hadoop jobs, etc.). Provided the Platform used remains constant, we can use the experiments, and their subsequent evaluation using the evaluation framework, to assess the effectiveness of each of these workflows.
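The comparability rule above can be written down as a small check. This is a sketch under our own assumptions: the `Experiment` fields and the `comparable` function are hypothetical names, and we read "just one of these changes" as exactly one of dataset or workflow differing.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Experiment:
    """Minimal, illustrative view of an experiment for comparison purposes."""
    dataset: str
    workflow: str
    platform: str
    evaluation_criteria: FrozenSet[str]


def comparable(a: Experiment, b: Experiment) -> bool:
    """Return True if the evaluations of a and b can be compared.

    Assumed reading of the rule: the platform and the evaluation criteria
    are the same, and exactly one of dataset or workflow differs.
    """
    if a.platform != b.platform:
        return False
    if a.evaluation_criteria != b.evaluation_criteria:
        return False
    changed = int(a.dataset != b.dataset) + int(a.workflow != b.workflow)
    return changed == 1


criteria = frozenset({"throughput"})
sequential = Experiment("TIFF collection", "sequential", "SCAPE Platform", criteria)
hadoop = Experiment("TIFF collection", "Hadoop-only", "SCAPE Platform", criteria)
other = Experiment("Another collection", "Taverna + Hadoop", "SCAPE Platform", criteria)

print(comparable(sequential, hadoop))  # same dataset, different workflow
print(comparable(sequential, other))   # both dataset and workflow differ
```

With these example values the first pair is comparable and the second is not, since both the dataset and the workflow changed.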
In some cases a new dataset will suggest a new scenario and it would be better to create a new scenario and experiment rather than attempt to expand on the scenario in the experiment.
Where Did Solutions Go?
In the existing scenarios there is some effort to capture solutions - tools that exist to address a given problem in a scenario. This is useful information and a good starting point for development, but it doesn't quite fit with the test beds providing requirements only and the components and platform sub-projects developing solutions - we would be assuming that a tool meets our needs before we have identified those needs.
[So we need to talk to the other sub-projects to ensure they get the idea that they should develop functional requirements and link them to the scenarios?]