Paweł Kominek, Michał Kozak, Aleksander Stroiński, Tomasz Parkoła
The DICOM medical data for this experiment comes from the overall WCPT dataset described at http://wiki.opf-labs.org/display/SP/WCPT+medical+dataset. The data used in this experiment consists of HL7 XML files.
The goal of this experiment is to analyse the epidemiological situation across WCPT patients. It includes the following analyses:
- Age of patients treated in a given period
- Sex of patients treated in a given period
- Number of cases of a given disease in a given period
- Number of abnormal results in laboratory examinations for given disease codes in a given period
- Average time of a patient’s visit for given disease codes in a given period
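The analyses listed above can be illustrated with a minimal sketch in plain Python. It assumes each visit has already been extracted from the HL7 XML files into a flat record; the `Visit` class, its field names, and the `summarize` function are hypothetical and chosen only for illustration, since the actual HL7 layout is not described here.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Visit:
    # Hypothetical flattened record; the real HL7 XML layout is not shown here.
    birth_date: date
    sex: str               # e.g. "M" or "F"
    disease_code: str      # e.g. an ICD-10 code
    admitted: datetime
    discharged: datetime
    abnormal_results: int  # number of laboratory results flagged as abnormal


def age_on(birth: date, day: date) -> int:
    """Age in full years on the given day."""
    return day.year - birth.year - ((day.month, day.day) < (birth.month, birth.day))


def summarize(visits, start: date, end: date, codes=None):
    """Aggregate visits admitted in [start, end], optionally limited to given disease codes."""
    ages, sexes, cases, abnormal = Counter(), Counter(), Counter(), Counter()
    hours = defaultdict(list)
    for v in visits:
        day = v.admitted.date()
        if not (start <= day <= end):
            continue
        if codes is not None and v.disease_code not in codes:
            continue
        ages[age_on(v.birth_date, day)] += 1
        sexes[v.sex] += 1
        cases[v.disease_code] += 1
        abnormal[v.disease_code] += v.abnormal_results
        hours[v.disease_code].append((v.discharged - v.admitted).total_seconds() / 3600)
    avg_hours = {code: sum(h) / len(h) for code, h in hours.items()}
    return {
        "ages": dict(ages),
        "sexes": dict(sexes),
        "cases": dict(cases),
        "abnormal": dict(abnormal),
        "avg_visit_hours": avg_hours,
    }
```

Note that this sketch counts age and sex once per visit, so a patient with several visits in the period is counted several times; counting distinct patients would require a patient identifier, which is assumed to be unavailable here.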
PSNC Hadoop Platform (http://wiki.opf-labs.org/display/SP/PSNC+Hadoop+Platform)
The analysis will be done by means of Hadoop jobs, which will be executed on the PSNC Hadoop cluster. There will be no separate general workflow, as the algorithm implemented in the Hadoop job will handle all of the processing. The main steps in the analysis will be:
- Define the input, the output, and the criteria for the analysis
- Implement Hadoop job
- Execute Hadoop job and gather results
- Prepare results in a human-readable way
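The four steps above can be sketched as a map/reduce pair written in plain Python and run locally. This is only an illustration of the pattern, not the job used in the experiment: the tab-separated `date<TAB>code` input format, the `CRITERIA` values, and all function names are assumptions, and a real job would read the HL7 XML files and be submitted to the cluster rather than run in a single process.

```python
from itertools import groupby
from operator import itemgetter

# Step 1: input, output and criteria (all values here are hypothetical).
CRITERIA = {"start": "2012-01-01", "end": "2012-12-31", "codes": {"A15", "J45"}}


def mapper(lines, criteria):
    """Step 2a: emit (disease_code, 1) for each matching 'date<TAB>code' line."""
    for line in lines:
        day, code = line.rstrip("\n").split("\t")
        # ISO 8601 dates compare correctly as plain strings.
        if criteria["start"] <= day <= criteria["end"] and code in criteria["codes"]:
            yield code, 1


def reducer(pairs):
    """Step 2b: sum the values per key; pairs must arrive sorted by key."""
    for key, group in groupby(pairs, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


def run_job(lines, criteria):
    """Step 3, simulated locally: map, sort (standing in for the shuffle), reduce."""
    return list(reducer(sorted(mapper(lines, criteria))))


def format_report(results):
    """Step 4: render the results as a small plain-text table."""
    rows = [f"{'Disease code':<14}{'Cases':>6}"]
    rows += [f"{code:<14}{count:>6}" for code, count in results]
    return "\n".join(rows)
```

Calling `format_report(run_job(lines, CRITERIA))` on an iterable of input lines yields one row per disease code that matches the criteria.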
Links to results of the experiment using the evaluation template.