Much work has been done on defining languages for capturing properties, and in SCAPE we will attempt to bring these strands together and make the property definitions more useful. The main works to reference here are JHOVE2, XCL and the original Planets Suite Core (PSC) interoperability classes.
The JHOVE2 model looks like this:
- Name [String]:
- Property name as defined by the format
- Type [java.lang.reflect.Type]:
- Scalar or collection; Java and JHOVE2 types
- Value [String]:
- The reported result comes from calling Type.toString()
- Unit [String]:
- Optional unit of measure label
- Identifier [I8R, String with restrictions, includes various namespaces, e.g. URIs, PUIDs, etc.]:
- Unique JHOVE2 identifier
- Description [String]:
- Optional description of property semantics
- Reference [String]:
- Optional reference to the controller section of the format specification
Introspection over these fields is used to generate full property reports; see https://bitbucket.org/jhove2/main/src/60048c6c8432/src/main/java/org/jhove2/core/reportable/info/ReportablePropertyInfo.java
Missing from this list: PropertyType (Raw, Derived).
Of the XCL tools, only the XCDL part is relevant here, as this is where the schema for properties is defined. The schema files are under http://xcltools.svn.sourceforge.net/viewvc/xcltools/trunk/extractor/xcl/xcdl/, e.g. http://xcltools.svn.sourceforge.net/viewvc/xcltools/trunk/extractor/xcl/xcdl/XCDLCore.xsd?revision=141&view=markup#l163
The summary below is based on that schema and on examples like those on page 14 of http://xcltools.svn.sourceforge.net/viewvc/xcltools/trunk/extractor/res/xcl/XCLDocumentation/Planets_PC2-D7_FinalXCDLSpec_Ext.pdf?revision=1 (same as from http://xcltools.svn.sourceforge.net/viewvc/xcltools/trunk/extractor/doc/XCLDocumentation/ ?):
- name [String, ID, (alias)*]
- source [String:(raw|implicit|added)]
- cat [String:(descr|hist|cont|extern)]
- rawValue Function: Wraps the distinct raw value, as extracted from the source object; by default bytes shall be encoded in UTF-16 for non-binary data, and as hex numbers for binary data. (how can you tell?)
- labValue Function: Wrapping element for labelled value.
- val, with optional unit (measureType), and a basic schema (type and number of repeats)
- objectRef Pointer to the source object, restricted to file:// URIs.
- dataRef Function: Reference to data. This can either be the source data (element 'data') or normalized data (element 'normData').
The source attribute gives the source the property is derived from: 'raw' = derived from the source object; 'implicit' = property is not fixed to the source object but derived from the source object's format specification; 'added' = property is neither raw nor implicit, but derived from the file, e.g. file size or original filename.
The cat attribute gives the property's category: 'descr' = descriptive property, i.e. occurrence of an object-describing property; 'hist' = history property, i.e. property that may appear in a different shape in the source object, which may be resolved in the XCDL description (e.g. compressed data); 'cont' = content property, i.e. relating directly to a byte sequence; 'extern' = property that refers to an external item, i.e. not related to the object's data, e.g. software and hardware used to create the object.
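The two closed vocabularies above (source and cat) map naturally onto Java enums. The following is only a minimal sketch: the enum names XcdlSource and XcdlCategory and their helper methods are hypothetical and do not come from the XCL code base; only the token values are taken from the schema description above.

```java
import java.util.Locale;

/** Values of the XCDL 'source' attribute as listed above. The enum name is hypothetical. */
enum XcdlSource {
    RAW, IMPLICIT, ADDED;

    /** Parses a token such as "raw"; throws IllegalArgumentException for unknown tokens. */
    static XcdlSource fromToken(String token) {
        return valueOf(token.trim().toUpperCase(Locale.ROOT));
    }

    /** Returns the lower-case token as it would appear in an XCDL document. */
    String token() {
        return name().toLowerCase(Locale.ROOT);
    }
}

/** Values of the XCDL 'cat' attribute as listed above. The enum name is hypothetical. */
enum XcdlCategory {
    DESCR, HIST, CONT, EXTERN;

    /** Parses a token such as "cont"; throws IllegalArgumentException for unknown tokens. */
    static XcdlCategory fromToken(String token) {
        return valueOf(token.trim().toUpperCase(Locale.ROOT));
    }

    /** Returns the lower-case token as it would appear in an XCDL document. */
    String token() {
        return name().toLowerCase(Locale.ROOT);
    }
}
```

Rejecting unknown tokens at parse time is a design choice of this sketch; a more lenient reader might prefer to keep the raw string instead.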
The PSC Property model looks like this:
- URI uri
- String name
- String value
- String unit (e.g. 'dB')
- String description
- String type (e.g. 'double')
All of these Property models wrap very similar concepts, but in different ways. XCL in particular tries to encapsulate some very complex concepts, generating schemes for units of measurement and for references to fragments. The JHOVE2 and PSC models are simpler and similar to each other, although they rely entirely on context to define the type of entity the property relates to. The open issues are:
- References to fragments.
- Units (not necessary to model, just pick one for the definition)
- Serialisation. JHOVE2 buries this in java.lang.reflect.Type.toString().
- Property types (not necessary)
So the basic proposal is a simple Java type with an XML mapping that is compatible with JHOVE2. Can we pick out JHOVE2 property types? Do we need to?
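As a rough illustration of what "a simple Java type with an XML mapping" could look like, here is a minimal sketch. The class name SimpleProperty, the choice of fields (a subset of the JHOVE2 listing above) and the XML element names are all assumptions made for this example; they are not JHOVE2's actual report schema, and compatibility with JHOVE2 would still need to be checked.

```java
/**
 * Hypothetical sketch of a simple property type with a hand-written XML mapping.
 * Element and attribute names are illustrative assumptions only.
 */
final class SimpleProperty {
    final String identifier;  // unique identifier, e.g. a URI
    final String name;        // property name
    final String type;        // type label, e.g. "int" or "double"
    final String value;       // value serialised as a string
    final String unit;        // optional, may be null
    final String description; // optional, may be null

    SimpleProperty(String identifier, String name, String type,
                   String value, String unit, String description) {
        this.identifier = identifier;
        this.name = name;
        this.type = type;
        this.value = value;
        this.unit = unit;
        this.description = description;
    }

    /** Serialises the property as a single XML element, omitting null optional fields. */
    String toXml() {
        StringBuilder sb = new StringBuilder();
        sb.append("<property id=\"").append(escape(identifier)).append("\">");
        element(sb, "name", name);
        element(sb, "type", type);
        element(sb, "value", value);
        element(sb, "unit", unit);
        element(sb, "description", description);
        sb.append("</property>");
        return sb.toString();
    }

    private static void element(StringBuilder sb, String tag, String text) {
        if (text == null) {
            return; // optional field not set
        }
        sb.append('<').append(tag).append('>')
          .append(escape(text))
          .append("</").append(tag).append('>');
    }

    /** Escapes the characters that are unsafe in XML text and attribute values. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

The mapping is written by hand here only to keep the sketch free of dependencies; an annotation-based binding would be an equally reasonable choice.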
For SCAPE, we need properties that apply to different entity types. For example, we wish to describe the properties of entities such as:
- A Format
- A Digital Object
- A Comparison of Two Digital Objects
- A Preservation Tool or Service
- An Executed Process
- and possibly more.
XCL only deals with properties of digital objects, whereas both JHOVE2 and PSC rely on context to distinguish the cases. For example, the URI for a JHOVE2 property is inferred automatically from the Java package hierarchy, and this scopes properties according to the code module that defines them. The PSC case is simpler still: the properties in a CharacteriseResult are properties of a digital object, whereas those in the Property list inside the ServiceReport are properties of the service execution process.
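One possible alternative to relying on context is to record the entity type explicitly on each property. The sketch below is purely illustrative and is not something any of the three models does: the names EntityType, ScopedProperty and ScopedProperties are hypothetical, and the enum constants simply mirror the entity list above.

```java
import java.net.URI;
import java.util.List;
import java.util.stream.Collectors;

/** Entity types from the list above. Names are hypothetical. */
enum EntityType { FORMAT, DIGITAL_OBJECT, COMPARISON, TOOL_OR_SERVICE, EXECUTED_PROCESS }

/** A property definition that states explicitly which kind of entity it applies to. */
record ScopedProperty(URI uri, String name, EntityType appliesTo) {}

final class ScopedProperties {
    private ScopedProperties() {}

    /** Returns only the properties that apply to the given entity type. */
    static List<ScopedProperty> forEntity(List<ScopedProperty> all, EntityType type) {
        return all.stream()
                  .filter(p -> p.appliesTo() == type)
                  .collect(Collectors.toList());
    }
}
```

With this shape, a mixed list of properties could be split into, say, digital-object properties and process properties without knowing which result class they came from.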
Proposal N: Properties, Measurements and Characteristics
The proposal is to define Property as above, with a URI as the key. A Measurement is then the pairing of the URI of a Property with a Value, possibly along with an Agent (?). A full Characteristic is a de-normalised version of a Measurement, where the full Property definition is present alongside the measured Value, but the Agent has been taken outside of the Characteristic (???)
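The relationship between the three concepts can be sketched as follows. This is a tentative reading of the proposal, not a settled design: the type names PropertyDef, Measurement, Characteristic and Characteristics are hypothetical, the Agent is assumed to be a plain string, and the value is assumed to live on the Measurement rather than on the Property definition.

```java
import java.net.URI;
import java.util.Map;

/** Property definition, keyed by its URI. Fields follow the simple model above, minus the value. */
record PropertyDef(URI uri, String name, String unit, String description, String type) {}

/** Normalised form: the Property is referenced by URI only; the Agent is an assumed plain string. */
record Measurement(URI propertyUri, String value, String agent) {}

/** De-normalised form: full Property definition alongside the value; the Agent is held elsewhere. */
record Characteristic(PropertyDef property, String value) {}

final class Characteristics {
    private Characteristics() {}

    /** Resolves the Property URI of a Measurement against a registry of definitions. */
    static Characteristic denormalise(Measurement m, Map<URI, PropertyDef> registry) {
        PropertyDef def = registry.get(m.propertyUri());
        if (def == null) {
            throw new IllegalArgumentException("Unknown property: " + m.propertyUri());
        }
        return new Characteristic(def, m.value());
    }
}
```

Whether the Agent belongs on the Measurement at all is still one of the open questions in the proposal; the sketch only shows where it would sit under that assumption.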