compared with
Current by Niels Bjarke Reimer
on Jul 19, 2013 12:16.

| *Title* \\ | IS45 Audio and Video Recordings have unreliable broadcast time information \\ |
| *Detailed description* | The Danish State and University Library (SB) holds large collections of Radio and TV Broadcasts. \\
\\
The radio broadcast files are WAV (22.05 kHz, 16 bit). The file duration of the testbed files is approximately 20 minutes to 10.5 hours. This means some recordings cover a number of shows. The metadata of the files are Radio Channel ID, start time and end time (part of the file names). The SB also has the program listings in a different collection. There is, however, no link between recordings and program information. \\
The WAV (22.05 kHz, 16 bit) Danish radio broadcast files in the testbed range in duration from approximately 20 minutes to 10.5 hours, so some recordings cover a number of shows. The mpeg-2 video files with Danish TV broadcasts in the testbed dataset run from approximately 20 minutes to 17 hours and contain a number of shows. The mpeg-1 video files with Danish TV broadcasts in the testbed dataset run from approximately 10 minutes to 16 hours, again containing a number of shows. The metadata of the files are the Radio or TV Channel ID, start time and end time (part of the file names). The SB also has the program listings in a different collection. However, recordings usually start 'a few minutes early' (just before the top of the hour) and end 'a few minutes late', e.g. running from 8:58 am to 10:03 am. Also, the programs do not always start precisely at the announced time\! Linking the program listings to exact timestamps in the audio and video files would make it possible to cut out single programs automatically on request.\\
\\
(Note that the [mpeg-2 transport stream with Danish TV broadcasts|Danish TV broadcasts, mpeg-2 transport stream] files are one-hour recordings. They also contain metadata on the shows being broadcast.) \\
\\ |
The mpeg-2 video files with Danish TV broadcasts in the testbed dataset run from approximately 20 minutes to 17 hours and contain a number of shows. The metadata of the files are TV Channel ID, start time and end time (part of the file names). The SB also has the program listings in a different collection. There is, however, no link between recordings and program information. \\
\\
The mpeg-1 video files with Danish TV broadcasts in the testbed dataset run from approximately 10 minutes to 16 hours, again containing a number of shows. The metadata of the files are TV Channel ID, start time and end time (part of the file names). The SB also has the program listings in a different collection. There is, however, no link between recordings and program information. \\
\\
(Note that the [mpeg-2 transport stream with Danish TV broadcasts|SP:mpeg-2 transport stream with Danish TV broadcasts] files are one-hour recordings. They also contain metadata on the shows being broadcast.) \\
\\
DRAFT |
| *Scalability Challenge* \\ | _What requirements are placed on the solution in terms of the SCAPE scales of scalability: content size, volume of content, complexity of content_ \\ |
| *[Issue champion|SP:Responsibilities of the roles described on these pages]* | _Who owns the issue? Identify the owner with a link to their contact page on the SCAPE Sharepoint site, as well as identifying their institution in brackets. Eg:_ [Schlarb Sven|https://portal.ait.ac.at/sites/Scape/_layouts/userdisp.aspx?ID=32] (ONB). Also note what the role of the Issue Champion is within their organisation. |
| *Other interested parties* \\ | _Any other parties who are also interested in applying Issue Solutions to their Datasets. Identify the party with a link to their contact page on the SCAPE Sharepoint site, as well as identifying their institution in brackets. Eg:_ [Schlarb Sven|https://portal.ait.ac.at/sites/Scape/_layouts/userdisp.aspx?ID=32] _(ONB)_ |
| *Possible Solution approaches* | _Brief brainstorm of possible approaches to solving the Issue. Each approach should be described in a single sentence as part of a bulleted list. Note that actual Solutions will be owned by the_ *{_}Solution Provider{_}* _who should be a different person from the Issue Champion. Reaching a satisfactory conclusion for the Issue should be considered a team effort between these parties._ \\ |
| *Scalability Challenge* \\ | The combined size of the collections in question is 630 TB. \\ |
| *[Issue champion|SP:Responsibilities of the roles described on these pages]* | [Bolette Jurik|https://portal.ait.ac.at/sites/Scape/_layouts/userdisp.aspx?ID=59] (SB) |
| *Other interested parties* \\ | |
| *Possible Solution approaches* | Most of the programs start with a jingle or some other recognizable intro. If we can search for these intros in the audio and video files, we can find the exact start times of the individual shows. \\ |
| *Context* | _Details of the institutional context to the Issue. (May be expanded at a later date)_ \\ |
| *Lessons Learned* | _Notes on Lessons Learned from tackling this Issue that might be useful to inform the development of Future Additional Best Practices, Task 8 (SCAPE TU.WP.1 Dissemination and Promotion of Best Practices)_ \\ |
| *Training Needs* | _Is there a need for providing training for the Solution(s) associated with this Issue? Notes added here will provide guidance to the SCAPE TU.WP.3 Sustainability WP._ \\ |
| *Datasets* | * [WAV with Danish Radio broadcasts, ripped audio CD’s, and SB in-house audio digitization|SP:WAV with Danish Radio broadcasts, ripped audio CD’s, and SB in-house audio digitization (WAVfiles)]
* [SP:mpeg video with Danish TV broadcasts] |
| *Solutions* | _Reference to the appropriate Solution page(s), by hyperlink_ |
* [Danish TV broadcasts, mpeg videos] |
| *Solutions* | [SP:SO36 Perform scalable search for small sound chunks in large audio archive]\\
[SP:SO2 xcorrSound QA audio comparison tool]\\ |
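
The jingle-search approach described under Possible Solution approaches can be sketched roughly as follows. This is a minimal illustration only, not the xcorrSound implementation: the function names `find_jingle_offset` and `offset_to_timestamp`, the toy sample rate and the synthetic signals are all assumptions made for the example. It slides a short jingle over a recording, scores each position with normalized cross-correlation, and converts the best offset into a wall-clock time using the recording start time (which, per the description, is part of the file name).

```python
import math
from datetime import datetime, timedelta


def find_jingle_offset(recording, jingle):
    """Return (offset, score) of the best match of jingle inside recording.

    score is the normalized cross-correlation (1.0 means an exact match
    up to a positive scale factor). Hypothetical helper for illustration.
    """
    n, m = len(recording), len(jingle)
    if m == 0 or m > n:
        raise ValueError("jingle must be non-empty and no longer than the recording")
    jingle_norm = math.sqrt(sum(x * x for x in jingle))
    best_offset, best_score = 0, -1.0
    for offset in range(n - m + 1):
        window = recording[offset:offset + m]
        window_norm = math.sqrt(sum(x * x for x in window))
        if window_norm == 0 or jingle_norm == 0:
            continue  # silent window: correlation is undefined, skip it
        dot = sum(a * b for a, b in zip(window, jingle))
        score = dot / (window_norm * jingle_norm)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


def offset_to_timestamp(recording_start, offset_samples, sample_rate):
    """Convert a sample offset into a wall-clock time."""
    return recording_start + timedelta(seconds=offset_samples / sample_rate)


# Toy data (assumption): a chirp-like "jingle" hidden in an otherwise silent
# recording. The real radio files are 22.05 kHz WAV; 10 Hz keeps the demo small.
sample_rate = 10
jingle = [math.sin(0.05 * i * i) for i in range(1, 21)]
recording = [0.0] * 300
recording[120:140] = jingle

# Recording start time as it might be read from a file name (assumed value).
recording_start = datetime(2013, 7, 19, 8, 58, 0)

offset, score = find_jingle_offset(recording, jingle)
show_start = offset_to_timestamp(recording_start, offset, sample_rate)
print(offset, round(score, 6), show_start.isoformat())
```

The direct loop costs time proportional to the recording length times the jingle length, so it is only suitable for illustrating the idea. For recordings of many hours, an FFT-based cross-correlation would be the more realistic choice.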