Title
Environmental Artists Datasets. This dataset consists of a set of disk images, a backup of a current workstation made with Time Machine, and an email archive exported from Gmail. The dataset contains a wide variety of unknown document types, and most of the disk images are believed to be HFS or HFS+.
Description
The dataset consists of a disk image of an old iMac (200 GB) and a disk image of a Mac Mini (80.38 GB). These two datasets were created by the environmental artists and their students from the late 1990s through 2010 and have not been analyzed.
Licensing
The datasets are the property of Stanford University Libraries. All literary rights reside with the document creators. Any copying or replication of the dataset is prohibited without the express permission of Stanford University Libraries and the document creators.
Owner
The datasets are the property of Stanford University Libraries. All literary rights reside with the document creators.
Dataset Location
Hash files of the datasets are available here as a zip file: iMac_dup.zip
NOTE - Because time was short, the percentage-of-duplication column was created in Excel rather than incorporated into the script. With more time and effort, this function could be scripted.
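As a rough illustration of how the duplication percentage might be scripted instead of calculated in Excel, here is a minimal Python sketch. It makes two assumptions that are not confirmed by the dataset: that the hash files contain md5sum-style lines ("<hash>  <path>"), and that "percentage of duplication" means the share of files whose hash has already occurred earlier in the list. The function names are hypothetical.

```python
from collections import Counter


def read_hashes(lines):
    """Yield the hash field from md5sum-style lines (assumed format: '<hash>  <path>')."""
    for line in lines:
        line = line.strip()
        if line:
            # The hash is taken to be the first whitespace-separated field.
            yield line.split(None, 1)[0]


def duplicate_percentage(hashes):
    """Return the percentage of files that repeat a hash already present in the list."""
    hashes = list(hashes)
    if not hashes:
        return 0.0
    counts = Counter(hashes)
    # Every occurrence of a hash beyond the first is counted as a duplicate.
    duplicates = sum(n - 1 for n in counts.values())
    return 100.0 * duplicates / len(hashes)
```

To apply it to a real hash file, open the file and pass the file object to read_hashes, then pass the result to duplicate_percentage. If the hash files use a different layout, only read_hashes would need to change.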
Dataset Champion
Issues brainstorm
- it would be useful to generate a report that outlines the similarities between all of the related datasets such as:
  - percentage of files that are duplicates
  - graph of overlap between various disk images
  - user profiles graph - what users were working with particular groups of content over time
- it would be useful to provide the collection creator with the ability to redact documents and construct a view of their born-digital archive that can be presented online
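The overlap between disk images mentioned in the first item could be estimated from the hash lists alone. The sketch below is one possible approach, not an existing script: it assumes each image has already been reduced to a list of file hashes, and the function name and image labels are hypothetical. The resulting figures could feed a graph of overlap.

```python
from itertools import combinations


def pairwise_overlap(images):
    """Compare every pair of images by the file hashes they share.

    images: dict mapping an image name to an iterable of file hashes.
    Returns a dict mapping (name_a, name_b) to a tuple of
    (shared unique hashes, % of a's unique hashes, % of b's unique hashes).
    """
    # Work with unique hashes so duplicates inside one image are not counted twice.
    unique = {name: set(hashes) for name, hashes in images.items()}
    report = {}
    for a, b in combinations(sorted(unique), 2):
        shared = len(unique[a] & unique[b])
        pct_a = 100.0 * shared / len(unique[a]) if unique[a] else 0.0
        pct_b = 100.0 * shared / len(unique[b]) if unique[b] else 0.0
        report[(a, b)] = (shared, pct_a, pct_b)
    return report
```

Matching on hashes only finds byte-identical files, so near-duplicates such as edited copies of a document would not appear in the overlap.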
List of Issues
A list of links to detailed Issue pages relevant to this Dataset. When you've added an Issue page, return to this page and add a link to it.