
h5. About these pages
The pages referenced below form a network of *Datasets*, preservation and curation *Issues* with those Datasets, and *Solutions* to those Issues. As such, these pages capture information and requirements about concrete digital preservation and curation challenges that are present in specific datasets and collections. The experiences of solving these Issues are written up on Solution pages. These in turn link to pages in the [OPF Tool Registry|TR:Home], and to actual code that can be downloaded and re-used.
The purpose of these pages is to share experiences in solving preservation and curation problems, so that we can learn from each other, and to articulate practitioners' needs and requirements to those who are in a position to produce practical solutions to their problems.
h5. Support
The work collated on this page is supported by: [Open Planets Foundation|http://openplanetsfoundation.org/], [Open Preservation Foundation|http://openpreservation.org/], [Jisc|http://www.jisc.ac.uk/], [European Commission|http://cordis.europa.eu/fp7/home_en.html], [Digital Preservation Coalition|http://www.dpconline.org/], [SPRUCE Project|http://wiki.opf-labs.org/display/SPR/Home], [AQuA Project|http://wiki.opf-labs.org/display/AQuA/Home], [SCAPE Project|http://www.scape-project.eu/], and you\!
{column}
{column:width=50%}
{column:width=50%}


{html}<a href="http://www.dcc.ac.uk/sites/default/files/documents/idcc13posters/Poster186.pdf"><img align="right" src="http://wiki.opf-labs.org/download/attachments/13764153/DCC+2012+poster.png"></a>{html}
h5. Practitioners need better characterisation tools
Analysis of the Datasets, Issues and Solutions collated on this page indicated a broad cross section of preservation requirements, but an overriding need for more effective characterisation. Practitioners need to understand more about their data and its condition, typically for quality assurance, for appraisal and assessment, and for identifying preservation risks. This analysis and the details of its conclusions are described in this poster, published at the 8th International Digital Curation Conference, Amsterdam, January 2013.
h5. Get involved
Anyone can contribute to these pages. All you need to do is [register for an OPF account|KB:Joining the OPF Labs site] (it's quick, free, and anyone can do it), and then start adding comments, adding value to existing pages, or contributing new ones. Please help us make this a valuable resource for all\!
{column}{section}
{section}{column}
h1. Datasets
!dataset2.png|align=left!
These are the Datasets or collections that relate to specific preservation Issues which in turn (may) have Solutions developed for them. The Datasets are categorised by their media type. [Click this link to create a new Dataset, then edit the italicised text|http://wiki.opf-labs.org/pages/createpage-entervariables.action?spaceKey=REQ&templateId=8617991&fromPageId=8356148].
----
h4. Audio datasets
Label: [audio|http://wiki.opf-labs.org/label/audio]
{contentbylabel:labels=+dataset +audio|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Disk image datasets
Label: [disk_image|http://wiki.opf-labs.org/label/disk_image]
{contentbylabel:labels=+dataset +disk_image|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Document datasets
Label: [document|http://wiki.opf-labs.org/label/document]
{contentbylabel:labels=+dataset +document|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Email datasets
Label: [email|http://wiki.opf-labs.org/label/email]
{contentbylabel:labels=+dataset +email|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Geodatasets
Label: [geodatasets|http://wiki.opf-labs.org/label/geodatasets]
{contentbylabel:labels=+dataset +geodata|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Image datasets
Label: [image|http://wiki.opf-labs.org/label/image]
{contentbylabel:labels=+dataset +image|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Mixed/Misc datasets
Label: [mixed_misc|http://wiki.opf-labs.org/label/mixed_misc]
{contentbylabel:labels=+dataset +mixed_misc|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Research datasets
Label: [researchdata|http://wiki.opf-labs.org/label/researchdata]
{contentbylabel:labels=+dataset +researchdata|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Software datasets
Label: [software|http://wiki.opf-labs.org/label/software]
{contentbylabel:labels=+dataset +software|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Web datasets
Label: [web|http://wiki.opf-labs.org/label/web]
{contentbylabel:labels=+dataset +web|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Video datasets
Label: [video|http://wiki.opf-labs.org/label/video]
{contentbylabel:labels=+dataset +video|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Other / untagged
{contentbylabel:labels=+dataset -image -document -audio -video -software -email -web -researchdata -disk_image -geodata -mixed_misc|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
{column}
{column}
h1. Issues
!issue3.png|align=left!
These are the preservation or other business-driven Issues that relate to a specific Dataset and may have one or more specific Solutions developed to solve them. The aim of an Issue page is to provide a detailed description of the preservation challenge and of the Issue Owner's requirements, to help inform the development of a Solution that solves the Issue.
[Click this link to create a new Issue, then edit the italicised text|http://wiki.opf-labs.org/pages/createpage-entervariables.action?spaceKey=REQ&templateId=8617992&fromPageId=8356152].
----
h4. Unsolved issues
Issues that do not have linked solutions. Why not suggest or contribute a solution?
Label: [unsolved_issue|http://wiki.opf-labs.org/label/unsolved_issue]
{contentbylabel:labels=+issue +unsolved_issue|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Appraisal and assessment issues
Issues related to appraising or assessing digital content as the first step in deciding how to proceed with preservation activities.
Label: [appraisal_assessment|http://wiki.opf-labs.org/label/appraisal_assessment]
{contentbylabel:labels=+issue +appraisal_assessment|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Bit rot issues
Issues related to Datasets that exhibit bit rot (files damaged by imperfect storage, failed write operations or software/processing errors) and require a Solution to identify, and if possible repair, problematic files.
Label: [bit_rot|http://wiki.opf-labs.org/label/bit_rot]
{contentbylabel:labels=+issue +bit_rot|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Conformance issues
Issues where Dataset content does not match a required profile, or needs to be checked or validated against a particular profile. These profiles are typically determined by an organisation's collection or preservation policy.
Label: [conformance|http://wiki.opf-labs.org/label/conformance]
{contentbylabel:labels=+issue +conformance|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Contextual issues
Issues related to the wider context of a particular Dataset.
Label: [context|http://wiki.opf-labs.org/label/context]
{contentbylabel:labels=+issue +context|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Data capture issues
Issues related to the capture, harvesting or extraction of data in order to facilitate effective preservation and access.
Label: [data_capture|http://wiki.opf-labs.org/label/data_capture]
{contentbylabel:labels=+issue +data_capture|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Duplication issues
Duplicated files can arise from a number of causes. Identical duplicates are relatively easy to detect. Similar duplicates (e.g. one file processed from another, or the same item scanned on a different device) can require much more complicated Solutions.
Label: [duplication|http://wiki.opf-labs.org/label/duplication]
{contentbylabel:labels=+issue +duplication|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
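As a minimal sketch of the "identical duplicates" case described above, the Python below groups files by their SHA-256 digest; files that share a digest can be treated as byte-for-byte identical for practical purposes. The function names (`sha256_of`, `find_identical_duplicates`) are hypothetical and are not taken from any Solution listed on this page. This approach does not detect similar-but-not-identical duplicates.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_duplicates(root: Path) -> List[List[Path]]:
    """Group files under root whose content hashes are equal.

    Hypothetical helper for illustration: returns only the groups
    that contain more than one file.
    """
    by_digest: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_identical_duplicates(Path("some/collection"))` would return one list per set of identical files. For large collections it may be worth comparing file sizes first, so that only files of equal size are hashed.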
h4. Embedded objects issues
Objects embedded within other objects (such as OLE, OOXML, PDF, ZIP) can pose identification, appraisal or risk assessment challenges.
Label: [embedded_objects|http://wiki.opf-labs.org/label/embedded_objects]
{contentbylabel:labels=+issue +embedded_objects|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. External dependency issues
Issues relating to digital objects that have dependencies on other objects or content on the web.
Label: [dependency|http://wiki.opf-labs.org/label/dependency]
{contentbylabel:labels=+issue +dependency|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Integrity issues
Issues relating to ensuring the integrity or fixity of Datasets.
Label: [integrity|http://wiki.opf-labs.org/label/integrity]
{contentbylabel:labels=+issue +integrity|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Obsolescence, preservation risk and business constraint issues
Issues that relate to the obsolescence of Datasets, preservation risk or business constraints placed on the way that Datasets are managed.
Label: [obsolescence|http://wiki.opf-labs.org/label/obsolescence]
{contentbylabel:labels=+issue +obsolescence|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Planning and management issues
Issues that relate to the general planning and management of digital preservation.
Label: [planning_management|http://wiki.opf-labs.org/label/planning_management]
{contentbylabel:labels=+issue +planning_management|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Quality issues
Issues relating to Datasets containing quality issues caused by digitisation, processing or format migration.
Label: [qa|http://wiki.opf-labs.org/label/qa]
{contentbylabel:labels=+issue +qa|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Retention/disposal issues
Issues relating to the retention, disposal and/or deletion of digital objects.
Label: [retention|http://wiki.opf-labs.org/label/retention]
{contentbylabel:labels=+issue +retention|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Rights issues
Issues related to rights or permissions that cause difficulties in managing or preserving Datasets.
Label: [rights|http://wiki.opf-labs.org/label/rights]
{contentbylabel:labels=+issue +rights|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Structural relationship issues
Digital entities can be made up of a number of objects (e.g. masters, service copies, metadata). Structural relationships are important for understanding which objects are part of an entity and what they are for.
Label: [structural_relationships|http://wiki.opf-labs.org/label/structural_relationships]
{contentbylabel:labels=+issue +structural_relationships|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. System obsolescence issues
Issues related to the obsolescence of software or other systems that manage Datasets.
Label: [system_obsolescence|http://wiki.opf-labs.org/label/system_obsolescence]
{contentbylabel:labels=+issue +system_obsolescence|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Unknown characteristics issues
Issues related to Datasets with unknown characteristics that are necessary for a preservation, management or other business need.
Label: [unknown_characteristics|http://wiki.opf-labs.org/label/unknown_characteristics]
{contentbylabel:labels=+issue +unknown_characteristics|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Unknown file formats issues
Datasets containing unknown file formats tend to pose a preservation risk and make management of them difficult.
Label: [unknown_file_formats|http://wiki.opf-labs.org/label/unknown_file_formats]
{contentbylabel:labels=+issue +unknown_file_formats|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Value and cost issues
Issues relating to the cost of Dataset management or the value of the Dataset to its owners and users.
Label: [value_cost|http://wiki.opf-labs.org/label/value_cost]
{contentbylabel:labels=+issue +value_cost|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Other / untagged
Issues that haven't been tagged with any of the labels listed in this column. This provides a useful mechanism for catching Issues that have not been tagged with sufficient detail, or identifying the need to add new labels to this page.
{contentbylabel:labels=+issue -appraisal_assessment -obsolescence -context -dependency -conformance -system_obsolescence -retention -rights -qa -unknown_characteristics -unknown_file_formats -integrity -embedded_objects -duplication -data_capture -bit_rot -planning_management -unsolved_issue -structural_relationships -value_cost|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
{column}
{column}
h1. Solutions
!solution2.png|align=left!
These are Solutions that address specific Issues encountered in particular Datasets. Solutions are typically quite specific to a particular Issue and Dataset but many will have a wider application. For details of tools utilised in a Solution, either follow links from individual Solution pages or see the [Tool Registry|TR:Digital Preservation Tool Registry].
[Click this link to create a new Solution, then edit the italicised text|http://wiki.opf-labs.org/pages/createpage-entervariables.action?spaceKey=REQ&templateId=8617994&fromPageId=8356159].
----
h4. Appraisal and assessment solutions
Solutions for assessing or appraising datasets.
Label: [appraisal_assessment|http://wiki.opf-labs.org/label/appraisal_assessment]
{contentbylabel:labels=+solution +appraisal_assessment|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Bit rot detection and repair solutions
Solutions for detecting and possibly repairing bit rot Issues in Datasets.
Label: [bit_rot_detection|http://wiki.opf-labs.org/label/bit_rot_detection]
{contentbylabel:labels=+solution +bit_rot_detection|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Characterisation solutions
Solutions for characterising content.
Label: [characterisation|http://wiki.opf-labs.org/label/characterisation]
{contentbylabel:labels=+solution +characterisation|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Data capture solutions
Solutions for capturing data from an external source, or imaging data from handheld media.
Label: [data_capture|http://wiki.opf-labs.org/label/data_capture]
{contentbylabel:labels=+solution +data_capture|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. De-duplication solutions
Solutions for detecting and managing duplicated digital objects or datasets.
Label: [de-duplication|http://wiki.opf-labs.org/label/de-duplication]
{contentbylabel:labels=+solution +de-duplication|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Embedded object solutions
Solutions for managing and preserving embedded digital objects.
Label: [embedded_objects|http://wiki.opf-labs.org/label/embedded_objects]
{contentbylabel:labels=+solution +embedded_objects|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Emulation solutions
Solutions utilising emulation or virtualisation technologies.
Label: [emulation|http://wiki.opf-labs.org/label/emulation]
{contentbylabel:labels=+solution +emulation|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. File format identification solutions
Solutions for identifying file formats.
Label: [identification|http://wiki.opf-labs.org/label/identification]
{contentbylabel:labels=+solution +identification|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Fixity solutions
Solutions for addressing integrity issues using approaches for generating and verifying fixity information such as manifests and checksums.
Label: [fixity|http://wiki.opf-labs.org/label/fixity]
{contentbylabel:labels=+solution +fixity|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
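The manifest-and-checksum approach mentioned above can be sketched roughly as follows. This is an illustrative outline only, assuming SHA-256 checksums and an in-memory manifest keyed by relative path; the helper names (`file_sha256`, `build_manifest`, `verify_manifest`) are hypothetical and do not correspond to any particular Solution or tool listed here.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Re-check every manifest entry and report missing or changed files."""
    problems = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            problems.append(f"missing: {relative_path}")
        elif file_sha256(target) != expected:
            problems.append(f"changed: {relative_path}")
    return problems
```

A manifest built at ingest can be stored alongside the Dataset and re-verified on a schedule; an empty list from `verify_manifest` means every recorded file is present with an unchanged checksum. Note that this sketch does not report files added after the manifest was built.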
h4. Migration solutions
Solutions for migrating data from one format to another.
Label: [migration|http://wiki.opf-labs.org/label/migration]
{contentbylabel:labels=+solution +migration|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Miscellaneous solutions
Solutions for miscellaneous topics.
Label: [miscellaneous|http://wiki.opf-labs.org/label/miscellaneous]
{contentbylabel:labels=+solution +miscellaneous|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Quality assurance solutions
Solutions for assessing or identifying quality Issues in Datasets.
Label: [quality_assurance|http://wiki.opf-labs.org/label/quality_assurance]
{contentbylabel:labels=+solution +quality_assurance|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Rights management solutions
Solutions for managing permissions and rights Issues.
Label: [rights|http://wiki.opf-labs.org/label/rights]
{contentbylabel:labels=+solution +rights|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Structural relationship solutions
Solutions for preserving or checking the structural relationships between digital objects belonging to a particular entity.
Label: [structural_relationships|http://wiki.opf-labs.org/label/structural_relationships]
{contentbylabel:labels=+solution +structural_relationships|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Validation solutions
Solutions for validating the conformance of digital objects to file format specifications or institutional profiles.
Label: [validation|http://wiki.opf-labs.org/label/validation]
{contentbylabel:labels=+solution +validation|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
h4. Other / untagged
Solutions that haven't been tagged with any of the labels listed in this column. This provides a useful mechanism for catching Solutions that have not been tagged with sufficient detail, or identifying the need to add new labels to this page.
{contentbylabel:labels=+solution -appraisal_assessment -bit_rot_detection -characterisation -data_capture -de-duplication -embedded_objects -emulation -fixity -identification -miscellaneous -migration -quality_assurance -rights -structural_relationships -validation|showLabels=false|showSpace=false|max=999|sort=modified|reverse=true|[email protected]}
{column}{column}
{column}
{section}
{recently-updated:[email protected]|labels=dataset issue solution}