

Paweł Kominek, Michał Kozak, Aleksander Stroiński, Tomasz Parkoła


The DICOM medical data for this experiment comes from the overall WCPT dataset described here:

The data for this experiment will be composed of HL7 XML files.

Purpose of this experiment

The goal of this experiment is to analyse the epidemiological situation across WCPT patients. It includes the following analyses:

  • Age of patients treated in a given period
  • Sex of patients treated in a given period
  • Number of cases of a given disease in a given period
  • Number of abnormal results in laboratory examinations for given disease codes in a given period
  • Average duration of a patient’s visit for given disease codes in a given period
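
As an illustration of two of the analyses listed above (number of cases per disease and average visit duration), a minimal sketch in Python follows. It assumes the HL7 XML files have already been parsed into simple records. The field names (`disease_code`, `admitted`, `discharged`), the sample codes, and the sample values are hypothetical and do not reflect the actual WCPT data or schema.

```python
from collections import defaultdict
from datetime import date, datetime

# Hypothetical, already-parsed visit records (assumed field names, made-up values).
VISITS = [
    {"disease_code": "J18", "admitted": datetime(2013, 1, 5, 9, 0),
     "discharged": datetime(2013, 1, 7, 9, 0)},
    {"disease_code": "J18", "admitted": datetime(2013, 2, 10, 8, 0),
     "discharged": datetime(2013, 2, 11, 8, 0)},
    {"disease_code": "A15", "admitted": datetime(2013, 1, 20, 10, 0),
     "discharged": datetime(2013, 1, 20, 22, 0)},
    {"disease_code": "J18", "admitted": datetime(2014, 3, 1, 8, 0),
     "discharged": datetime(2014, 3, 2, 8, 0)},
]


def in_period(visit, start, end):
    """True if the visit's admission date falls within [start, end]."""
    return start <= visit["admitted"].date() <= end


def cases_per_disease(visits, start, end):
    """Count visits per disease code within the period."""
    counts = defaultdict(int)
    for visit in visits:
        if in_period(visit, start, end):
            counts[visit["disease_code"]] += 1
    return dict(counts)


def average_visit_hours(visits, codes, start, end):
    """Average visit duration in hours for the given codes, or None if no visit matches."""
    durations = [
        (v["discharged"] - v["admitted"]).total_seconds() / 3600
        for v in visits
        if v["disease_code"] in codes and in_period(v, start, end)
    ]
    return sum(durations) / len(durations) if durations else None


period = (date(2013, 1, 1), date(2013, 12, 31))
cases = cases_per_disease(VISITS, *period)
avg_hours = average_visit_hours(VISITS, {"J18"}, *period)
```

Here the period is matched on the admission date only; whether a visit that spans the period boundary should count is one of the criteria that would have to be fixed before the real analysis.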


PSNC Hadoop Platform (


The analysis will be done by means of Hadoop jobs executed on the PSNC Hadoop cluster. There will be no separate general workflow, as the algorithm implemented in each Hadoop job will handle all of the processing. The main steps in the analysis will be:

  1. Define the input, output, and criteria for the analysis
  2. Implement the Hadoop job
  3. Execute the Hadoop job and gather the results
  4. Prepare results in a human-readable way
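
The four steps above can be sketched as a map/reduce computation. The following Python snippet is not the Hadoop job used in the experiment; it only simulates the map, shuffle, and reduce phases locally for one analysis (sex of patients treated in a given period). The record fields (`sex`, `year`), the function names, and the sample data are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def mapper(record, start_year, end_year):
    """Step 1 criteria applied per record: emit (sex, 1) if the record is in the period."""
    if start_year <= record["year"] <= end_year:
        yield record["sex"], 1


def reducer(key, values):
    """Sum all counts emitted for one key."""
    return key, sum(values)


def run_job(records, start_year, end_year):
    """Steps 2-3: run map, then shuffle (sort and group by key), then reduce."""
    pairs = [kv for r in records for kv in mapper(r, start_year, end_year)]
    pairs.sort(key=itemgetter(0))
    return [
        reducer(key, (value for _, value in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    ]


def format_results(results):
    """Step 4: render the results as tab-separated lines for human reading."""
    return "\n".join(f"{key}\t{count}" for key, count in results)


# Hypothetical input records (made-up values).
RECORDS = [
    {"sex": "F", "year": 2012},
    {"sex": "M", "year": 2012},
    {"sex": "F", "year": 2013},
    {"sex": "M", "year": 2010},
]

results = run_job(RECORDS, 2012, 2013)
report = format_results(results)
```

On a real cluster the sort-and-group step is performed by the framework between the map and reduce phases, so only the mapper and reducer logic would need to be implemented in the job itself.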


Links to results of the experiment using the evaluation template.
