
Code implementation is here: [https://github.com/openplanets/tabular-data-normaliser]
A detailed description of the workflow is available in this blog post: [http://www.openplanetsfoundation.org/blogs/2013-03-01-tabular-data-normalisation-tool]
The code makes use of MapReduce/Hadoop: normalisation of an input file occurs in the map phase, and collation of the results occurs in the reduce phase.
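The map/reduce split described above can be sketched in plain Python, without Hadoop, to show the shape of the workflow. The field names and normalisation rules below (date-separator unification, whitespace trimming) are illustrative assumptions, not the tool's actual configuration:

```python
import csv
import io
from collections import defaultdict

def map_normalise(line):
    """Map phase: normalise one tabular record, emit (key, value) pairs.

    The two-column (date, value) layout and the specific rules are
    assumptions for illustration only.
    """
    row = next(csv.reader(io.StringIO(line)))
    date, value = row
    norm_date = date.strip().replace("/", "-")  # unify date separators
    norm_value = value.strip()                  # trim stray whitespace
    yield norm_date, norm_value

def reduce_collate(pairs):
    """Reduce phase: collate all normalised values under each key."""
    collated = defaultdict(list)
    for key, value in pairs:
        collated[key].append(value)
    return dict(collated)

lines = ["2013/03/01, 42", "2013-03-01,7 "]
pairs = [kv for line in lines for kv in map_normalise(line)]
print(reduce_collate(pairs))  # {'2013-03-01': ['42', '7']}
```

In real Hadoop the framework performs the shuffle between the two phases, grouping values by key before the reducer runs; the `defaultdict` grouping here stands in for that step.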
h2. Requirements/Evaluation Criteria/Conditions of Satisfaction
To assess whether the code was generic enough to be used on other datasets, it was run against a second test dataset and the outputs were checked. The only change required was a new version of the normalisation properties file, tailored to the new input data.
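The normalisation properties file can be thought of as a mapping from input columns to normalisation rules, so only this file changes between datasets. A hypothetical example follows; the key names and rule identifiers are assumptions for illustration, not the tool's real format:

```
# Hypothetical normalisation properties file -- illustrative only
column.0.name = date
column.0.rule = normalise-date
column.1.name = measurement
column.1.rule = trim-whitespace
```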