Evaluator(s)
Rune Ferneke-Nielsen, SB
Purpose
Please note that the evaluation work has not been completed, and the numbers for the second iteration are missing.
In this second evaluation, Evaluation 2, of the policy-driven validation experiment, the focus is on measuring the performance of extracting metadata from, and storing metadata in, our Fedora-based repository. As mentioned in the first evaluation, image files are accessed via an NFS mount, and this will not be changed in the second evaluation. This setup is a valid environment configuration at SB, and it will normally require manual intervention to get the files placed and made available at the NFS mount.
We might want to use a multi-threaded approach to improve performance, but at the same time we do not want to overload the repository with too many simultaneous requests. Once we have performed the first set of tests, we should know whether this is required.
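The trade-off between parallelism and repository load can be illustrated with a bounded worker pool. The following is only a minimal sketch and not the actual implementation: `fetch_mets` is a hypothetical stand-in for a repository request, and the cap of 4 simultaneous requests is an assumed value that would have to be tuned against the real repository.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Assumed cap on simultaneous requests; the real value must be found by testing.
MAX_SIMULTANEOUS_REQUESTS = 4

_lock = threading.Lock()
_in_flight = 0
peak_in_flight = 0  # highest number of requests observed at the same time


def fetch_mets(object_id):
    """Hypothetical stand-in for a repository call; returns a fake METS string."""
    global _in_flight, peak_in_flight
    with _lock:
        _in_flight += 1
        peak_in_flight = max(peak_in_flight, _in_flight)
    try:
        return f"<mets OBJID='{object_id}'/>"
    finally:
        with _lock:
            _in_flight -= 1


def extract_all(object_ids, max_workers=MAX_SIMULTANEOUS_REQUESTS):
    # The pool size bounds how many requests can be outstanding at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input ids.
        return list(pool.map(fetch_mets, object_ids))


documents = extract_all([f"page-{i}" for i in range(20)])
```

Setting `max_workers=1` reduces this to the single-threaded case, so the same code path could be used for both the baseline and the multi-threaded runs.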
Evaluation points
Assessment of measurable points
Metric | Description | Metric baseline | Metric goal | ? |
---|---|---|---|---|
NumberOfObjectsPerHour | Performance efficiency - Capacity / Time behaviour: number of METS documents (each representing a newspaper page) being extracted from repository | ? | >5000 | ? |
NumberOfObjectsPerHour | Performance efficiency - Capacity / Time behaviour: number of METS documents (each representing a newspaper page) being stored in repository | ? | >5000 | ? |
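As an illustration of how the NumberOfObjectsPerHour metric could be derived from a test run, the sketch below scales a document count and an elapsed time to an hourly rate and compares it with the goal of more than 5000. The function names are hypothetical, and the figures in the usage note are made-up inputs, not measured results.

```python
GOAL_OBJECTS_PER_HOUR = 5000  # metric goal from the table: >5000


def objects_per_hour(object_count, elapsed_seconds):
    """Scale the number of processed METS documents to an hourly rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return object_count * 3600.0 / elapsed_seconds


def meets_goal(object_count, elapsed_seconds, goal=GOAL_OBJECTS_PER_HOUR):
    # The goal is a strict inequality, so exactly 5000 per hour does not pass.
    return objects_per_hour(object_count, elapsed_seconds) > goal
```

For example, a hypothetical run that handles 1000 documents in 600 seconds corresponds to `objects_per_hour(1000, 600)`, i.e. 6000 documents per hour, which would satisfy the goal.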
Assessment of non-measurable points
Reliability - Stability indicators
- The software packages that handle communication with the repository, extracting and storing METS documents, are implemented as a first version. They are not used in a production environment, and may require several cycles before being sufficiently stable (concern).
- The metadata repository at SB is used as a production system and has a user community at [www.fedora-commons.org]
Functional suitability - Correctness
- The software packages that handle communication with the repository may need further testing and use before sufficient confidence in their correctness is achieved (concern).
Maintainability - Reusability
- Currently, this is not meant to be reusable; it is specialised to fit the data model and interfaces of our repository (no concern).
Commercial readiness
- It is not ready 'as is' and will probably require some development effort.
Technical details
Implementation
The implementation can be found on GitHub: statsbiblioteket/scape-stager-loader