Welcome to the SCAPE Public Wiki
Getting started
To get started with the SCAPE wiki, first follow the instructions at Joining the OPF Labs site. Then, if you are a SCAPE project member, get in touch with the people listed on this SCAPE SharePoint announcement.
Popular labels
- arc
- azure
- characterisation
- conformance
- context
- dataset
- document
- fedora-commons
- fixity
- formatprofile
- hadoop
- identification
- image
- initial
- integrity
- issue
- jpeg2000
- lsdr
- lsdrscenario
- microsoft
- migration
- miscellaneous
- obsolescence
- page
- pc
- planning
- qa
- quality_assurance
- rdscenarios
- repair
- representationinformation
- researchdata
- scape
- scenario
- solution
- structural_relationships
- system_obsolescence
- taverna
- training
- unknown_characteristics
- unknown_file_formats
- untagged
- validation
- value_cost
- watch
- web
- webarchive
- workflow
Index
- Technical Coordinator
- Events
- SCAPE Scenario Workshop, 1-3 February 2012, Portugal
- SCAPE-Taverna hackathon 16 Nov 2011 Manchester
- First SCAPE Developer Workshop
- SCAPEdev1 Agenda
- 2011-02 Vienna Kick-Off Developer Workshop
- SCAPE Training Event - Future Formats First, Building Applications Infrastructure for Action Services
- SCAPE Training event - Keeping Control - Scalable Environments for Identification and Characterisation
- SCAPE Training Event - Effective, Evidence-Based Preservation Planning
- SCAPE & OPF Hackathon, Hadoop-Driven Digital Preservation
- SCAPE Training Event - Preserving Your Preservation Tools
- Managing Digital Preservation - A SCAPE & OPF Executive Seminar
- Final Developer Workshop Ideas
- Digital Preservation Sustainability on the EU Policy Level
- SCAPE Virtual Hackathons
- SCAPE Developer's Guide
- SCAPE Members Area
- ERCIM News 86 Article
- Integration
- Preservation Components, Workflows, Planning and Platform integration
- SB-CaseStudy with TUW
- SCAPE Audiences
- SCAPE Evaluation Methodology
- SCAPE Technical Overview
- SCAPE Training Events Planning
- SCAPE Training - Aarhus
- SCAPE Training - Guimarães
- Agenda - Keeping Control - Scalable Preservation Environments for Identification, Characterisation and Validation
- Agenda - Sustaining Digital Culture 2012
- Guimarães Training Materials
- Session Plan - Scalable Preservation Environments for Identification and Characterisation
- Session Plan - Sustaining Digital Culture
- SCAPE Training - London
- SCAPE Training - The Hague
- Session plan example
- SCAPE User-Group
- Success Stories
- Detecting duplicates on large collections of digitized book pages
- Enhancement of cloud platform for interoperability
- Ensuring valuable material is not lost during migration
- Improving digitized radio recordings while keeping them authentic!
- Jpylyzer usage on image content
- let SCOUT be your preservation guide
- Preservation planning made fast and easy
- QA and Characterisation of Web Content
- Success Story Template
- TIFF to JP2 Migration
- Technical Architecture Report
- Tool Development Status
- Work Package Status Pages
- SCAPE Take Up - Final Year
- ZZ-Deprecated
- SCAPE Daily Virtual Stand-Up Technical Meeting
- SCAPE IRC Channels
- SCAPE Platform in Practice
- SCAPE Scenarios - Datasets, Issues and Solutions
- Issue Overview
- Scenarios
- LSDR Scenarios
- LSDRT1 Assessing preservation risks in large media files
- LSDRT2 Validating files migrated from TIFF to JPEG2000
- LSDRT3 Validating Migrated Images 'Visually'
- LSDRT4 Out-of-sync Sound and Video in wmv to Video Format-X Migration Results
- LSDRT5 Detecting audio files with very bad sound quality
- LSDRT6 Large scale migration from mp3 to wav
- LSDRT7 Characterise very large video files
- LSDRT9 Characterisation of large amounts of wav audio
- LSDRT10 Capturing Representation Information from original image files
- LSDRT11 Duplicate image detection within one book
- LSDRT12 Quality assurance in redownload workflows of digitised books
- LSDRT13 Potential bit rot in image files that were stored on CD
- LSDRT14 Audio and Video Recordings Missing Important Metadata
- LSDRT15 Assessing preservation risks in PDF files
- LSDRT16 Evaluate preservation risks from FFProbe and Manzanita Crosscheck characterisation information
- LSDRT Scenarios Completed
- LSDRT17 Perform policy driven validation on Audio+video data
- LSDRT18 Large scale digital book ingest
- LSDRT19 Ingest using hosted instance of ExLibris Rosetta
- Research Dataset Scenarios
- RDST1 General Scientific Data Handling Scenarios
- RDST2 Format Migration of (raw) Scientific Datasets
- RDST3 Maintaining understandability and usability of raw data through external resources
- RDST4 Preserving the value of raw data and verifiability of processed datasets forming part of a scientific workflow
- Web Content Scenarios
- Moved to UserStories… WCT3 Characterise web content in ARC and WARC containers at State and University Library Denmark
- WCT1 Comparison of Web Archive pages
- WCT2 ARC to WARC migration
- WCT4 Web Archive Mime-Type detection at Austrian National Library
- WCT6 (W)ARC to HBase migration
- WCT7 Format obsolescence detection
- WCT8 Huge text file analysis using hadoop
- Datasets
- State and University Library Denmark - Danish National Heritage Video, Audio, and Image Collections
- British Library - Books & Newspapers Collections
- National Library of the Netherlands - Image Repository Content
- Austrian National Library - Web Archive
- British Library - International Dunhuang Project Manuscripts
- State and University Library Denmark - Web Archive Data
- Govdocs1 Open Corpus
- STFC Scientific Datasets
- British Library - Research Datasets
- KB Open Access Journals PDFs
- Camera raw file images
- Internet Memory Web Archive
- Issues
- IS1 Digitised TIFFs do not meet storage and access requirements
- IS2 Do acquired files conform to an agreed technical profile, are they valid and are they complete?
- IS3 Large media files are difficult to characterise without mass processing + We cannot identify preservation risks in uncharacterised files
- IS5 Digital objects archive contains unidentified content
- IS6 Determine render-ability of displayable web objects
- IS7 Incompleteness and inconsistency of web archive data
- IS8 Diversity of office document formats in digital objects archive
- IS9 Archive system migration preserving and enriching AIPs
- IS10 Potential bit rot in image files that were stored on CD
- IS11 PDF files may face preservation risks
- IS12 ARC to WARC migration
- IS13 wmv to Video Format-X Migration Results in Out-of-sync Sound and Video
- IS14 Diverse preservation risks in large archives with millions of objects
- IS15 Long-term access and decoding of JP2 images
- IS16 Normalisation of JPEG 2000 images
- IS17 Characterisation of text-based formats
- IS18 Verify bitstream integrity
- IS19 Migrate whole archive to new archiving system
- IS20 Detect audio files with very bad sound quality
- IS21 Migration of mp3 to wav
- IS22 Characterise and Validate very large mpeg-1 and mpeg-2 files
- IS24 Characterisation of large amounts of wav audio
- IS25 Web Content Characterisation
- IS26 Dealing with difficult identification cases
- IS27 Quality assurance in redownload workflows of digitised books
- IS28 Structural and visual comparisons for web page archiving
- IS29 Characterisation and validation of very large data files
- IS30 Fixity capturing and checking of very large data files
- IS31 Semantic checking of very large data files
- IS32 Basic Migration of RAW to NeXus data
- IS33 Enhanced migration of RAW to NeXus data
- IS34 ISIS instrument website no longer applicable or available
- IS35 Mantid website or software no longer applicable or available
- IS36 Examine the long term value of the preserved datasets
- IS37 Preserving the verifiability and provenance of processed datasets
- IS38 (W)ARC to HBASE migration
- IS39 Format obsolescence detection
- IS40 Complexity of camera raw files
- IS41 Analyse huge text files containing information about a web archive
- IS42 Detecting Encryption and DRM in Digital Content
- IS43 Determining general 'document' properties
- IS44 Migrated image metadata must map or match to those of the original
- IS45 Audio and Video Recordings have unreliable broadcast time information
- IS46 Book page image duplicate detection within one book
- IS47 Identify Preservation Risks from audio+video characterisation information
- IS48 Validate archival files against an institutional content policy regarding formats
- IS49 Large scale ingest of a large book collection
- Solutions
- SO1 Simple JP2 file structure checker
- SO2 xcorrSound QA audio comparison tool
- SO3 Comparing identification tools
- SO4 Audio mp3 to wav Migration and QA Workflow
- SO5 Video Migration and QA
- SO06 Use Ffprobe to characterise audio+video
- SO07 Develop Warc Unpacker
- SO8 QA for TIFF to JP2K conversion (image comparison tool based on histograms and profiles)
- SO9 Matchbox - Image comparison tool based on bag-of-(visual-)words matching
- SO10 QA for TIFF to correspondent JP2K comparison (image comparison tool based on SIFT-matching)
- SO11 The Tika characterisation Tool
- SO12 Tool testing framework
- SO14 Fuse mounting (w)arc files
- SO15 JP2 validator and properties extractor
- SO16 QA for estimation of affine transformation (image comparison tool based on SSIM algorithm)
- SO17 Web Archive Mime-Type detection workflow based on Droid and Apache Tika
- SO18 Comparing two web page versions for web archiving
- SO19 Recognize inaccurate graphical image files based on a pattern-set
- SO20 Extending JHOVE to characterise NeXus data format
- SO21 Extending the NeXus validation toolkit to cope with very large data files
- SO22 Developing a Raw-to-NeXus migration tool
- SO23 Pushing additional metadata into NeXus metadata fields
- SO24 Use Preservation Network Model to record "deep" dependencies and to allow tracking over time
- SO25 Rosetta v3.0 Implementation Integrated with DROID 6, JHOVE1, NLNZ tool and more...
- SO26 Automated RAW to DNG migration+QA
- SO27 Analyse huge text files containing information about a web archive using Hadoop
- SO28 A heuristic measure for detecting undesired influence of lossy JP2 compression on OCR in the absence of ground truth
- SO29 Extending JHOVE to characterise very large NeXus data file
- SO30 Automated assessment of JP2 against a technical profile
- SO31 Preservation Grade TIFF to JPEG2000 Migration
- SO32 Image Metadata Extractor
- SO33 Image Metadata Compare
- SO34 Use Manzanita Crosscheck to validate mpeg transport streams
- SO35 Use schematron as the content profile language to validate files by evaluating their characterisation information
- SO36 Perform scalable search for small sound chunks in large audio archive
- SO37 Connector API Technical Compatibility Kit
- How to add a new Dataset, Issue, Solution or Scenario
- Responsibilities of the roles described on these pages
- SCAPE Platform installations
- Preparation DL2014 and review demos wiki page
- Examples of working and best practice
- SCAPE Platform
- Components
- Data Centre Deployment
- Data Processing Cluster
- Planning and Watch - Requirements and Interfaces
- Platform Release
- Repository System
- Workflow Support
- PT.WP4.Task 1 Guidelines for Taverna Service Developers
- PT.WP.4.MS38 Basic Taverna Workbench available for testbed use
- PT.WP.4 Task 1 CP042 Requirements for the Taverna Workbench
- PT.WP.4 Task 2 CP046 Requirements documents for provenance component
- PT.WP.4 Task 3 CP048 Requirements for the Preservation Components Catalogue
- PT.WP.4 Task 4 CP049 Requirements documents for the preservation sharing platform
- SCAPE Stories
- Demonstrations
- Experimental Datasets
- LSDRT Experimental Datasets
- Austrian National Library - Digital Book Collection
- Austrian National Library Tresor Music Collection
- BL 19th Century Digitized Newspapers
- Danish newspaper - Morgenavisen Jyllandsposten
- Danish Radio broadcasts, mp3
- Danish Radio broadcasts, ripped audio CDs, and SB in-house audio digitization (WAV files)
- Danish scanned books (TIFF format)
- Danish TV broadcasts, mpeg-2 transport stream
- Danish TV broadcasts, mpeg videos
- Danish TV broadcasts (wmv - windows media video)
- Govdocs1 Corpus
- KB Metamorfoze Migration (sample batch)
- MDST Experimental Datasets
- RDST Experimental Datasets
- VDST Experimental Datasets
- WCT Experimental Datasets
- Experimental Platforms
- Stories and Experiments
- Experiment Overview
- Large Scale Digital Repository Testbed
- Characterisation of Large Audio and Video Files
- Large Scale Audio Migration
- Large scale document characterization and identification with Tika and DROID on SCAPE Azure platform
- Large Scale Image Migration
- Large Scale Ingest
- Performance of large scale office document migration on SCAPE Azure platform
- Policy-Driven Identification of Preservation Risks in Electronic Document Formats
- Quality Assurance of Digitized Books
- Repository Profiling
- Validation of Archival Content Against an Institutional Policy
- Research Datasets Testbed
- Migration from local format to domain standard format
- Normalise Disparate Tabular Data Sources
- Preserving the context and links to research data or preserving research objects
- Research Object Linkage Monitoring Over Time
- Persistent Data Citation of Dynamically Created Subsets
- Identification, validation and checksumming of a complex corpus
- Web Content Testbed
- ARC to WARC Migration
- Comparison of Web Snapshots
- File Format Identification and Characterisation of Web Archives
- Data Center Testbed
- Large-scale video processing and interlinking
- Medical Dataset
- Large scale access at hospital
- Large scale access for educational purposes
- Large scale analysis
- Analysis of epidemiological situation across WCPT patients
- Evaluation of the age of patients treated in a given period
- Evaluation of the average time of patients' visits for given disease codes in a given time period
- Evaluation of the number of abnormal results in laboratory examinations for given disease codes in a given period
- Evaluation of the number of medical cases for a given period
- Evaluation of the patients' gender for a given period
- Large scale ingest of medical data
- Catalogue of Preservation Policy Elements
- Introduction
- SCAPE Policy Framework
- How to use the Catalogue
- Policy Elements
- 1. Guidance Policy Authenticity
- 2. Guidance Policy Bit Preservation
- 3. Guidance Policy Functional Preservation
- 4. Guidance Policy Digital Object
- 5. Guidance Policy Metadata
- 6. Guidance Policy Rights
- 7. Guidance Policy Standards
- 8. Guidance Policy Access
- 9. Guidance Policy Organisation
- 10. Guidance Policy Audit and Certification
- Further Reading
- Published Preservation Policies