
This page describes the installation of the SCAPE platform in a High Performance Computing centre. The hardware platform is described [here|http://wiki.opf-labs.org/display/SP/UVT+Hadoop+Platform].

Note: This page concludes *MS91 - Services for integrating remote working*, due in M38.


h5. UVT Data Center services

Based on the requirements of the project partners (BUT, FIZ, etc.), UVT provides a set of services that support the various sub-projects and activities within SCAPE. The services provided can be classified as:
* Execution services
* Customisation services
* Data storage services
* IaaS services

h6. Execution Services

The execution services provided by UVT cover several facilities, including (but not restricted to): Batch Scheduling, MapReduce services, and the QosCosGrid (QCG) Compute API.

The Batch Scheduling service is based on the IBM® LoadLeveler scheduling system. The service allows the SCAPE partners to use UVT's computing facilities, featuring more than 40 x86 servers. This service also provides access to specialised resources such as GPU computing nodes. The Batch Scheduling service is complemented by the QCG API developed by the Applications Department at PSNC. The QCG API allows integration with the partners' applications.
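
As an illustration, the sketch below submits a job to LoadLeveler from Python. The job class, executable path and file names are hypothetical placeholders; the actual values depend on the account configuration issued by UVT.

{code:python}
#!/usr/bin/env python
"""Sketch: submit a batch job to a LoadLeveler scheduler.

The class name ('scape') and the executable path are hypothetical;
real values are site-specific and provided by UVT.
"""
import subprocess
import tempfile

# A minimal LoadLeveler job command file; '#@' lines are scheduler directives.
JOB_CMD = """\
#@ job_name   = scape_migration
#@ executable = /home/scape/bin/migrate.sh
#@ output     = migrate.$(jobid).out
#@ error      = migrate.$(jobid).err
#@ class      = scape
#@ queue
"""

# Write the job command file to disk, then hand it to the scheduler.
with tempfile.NamedTemporaryFile("w", suffix=".cmd", delete=False) as f:
    f.write(JOB_CMD)
    cmd_file = f.name

# llsubmit is the standard LoadLeveler submission command.
subprocess.check_call(["llsubmit", cmd_file])
{code}

Once submitted, the job can be monitored with the standard {{llq}} command.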

Besides the two batch-oriented services, UVT also provides access to Hadoop resources, both as a dedicated SCAPE cluster (7 dedicated HP AMD64 servers) and as on-demand Hadoop clusters based on the [SCAPE Cloud Deployment toolkit|http://wiki.opf-labs.org/display/SP/Portability+of+SCAPE+Platform+over+Multiple+Cloud+Environments].
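
For partners new to MapReduce, the sketch below shows a minimal wordcount-style mapper usable with Hadoop Streaming on the dedicated cluster. The HDFS paths and the location of the streaming jar in the docstring are assumptions and will differ per installation.

{code:python}
#!/usr/bin/env python
"""mapper.py -- a minimal Hadoop Streaming mapper (wordcount-style).

A matching reducer.py would sum the counts per key. Submit with
(jar location and HDFS paths are assumptions):

  hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
      -input /scape/input -output /scape/output \
      -mapper mapper.py -reducer reducer.py \
      -file mapper.py -file reducer.py
"""
import sys

for line in sys.stdin:
    for word in line.split():
        # Emit "key<TAB>1" pairs; Hadoop groups them by key for the reducer.
        print("%s\t%d" % (word, 1))
{code}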


h6. Customisation Services

On top of the aforementioned execution services, UVT also provides support for customising existing services to the requirements of SCAPE users. Examples include specific runtime configurations, covering both software and hardware. For instance, off-screen CUDA rendering support was provided specifically for BUT, allowing the execution of OpenGL applications on headless GPU systems.


h6. Data Storage Services

The Data Storage Services provide the project partners with storage space on UVT's infrastructure. The storage services include both GPFS storage (accessible via FTP/SFTP) and Hadoop HDFS storage.
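
As a minimal sketch of the SFTP access path to the GPFS storage, assuming the widely used {{paramiko}} library (host name, credentials and paths below are hypothetical placeholders):

{code:python}
"""Sketch: copy a file onto the UVT GPFS storage over SFTP.

Host, user, password and paths are placeholders; the real values
are issued by UVT when storage space is allocated.
"""
import paramiko

transport = paramiko.Transport(("storage.uvt.example", 22))
transport.connect(username="scape_user", password="secret")
sftp = paramiko.SFTPClient.from_transport(transport)

# Upload a local archive into the project's GPFS directory.
sftp.put("results.tar.gz", "/gpfs/scape/results.tar.gz")

sftp.close()
transport.close()
{code}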

The HDFS storage service is accessible both directly from Hadoop jobs and through the HDFS FTP Server developed by UVT within the SCAPE Project. The FTP server allows remote manipulation of the HDFS filesystem from legacy applications or commodity file transfer utilities; a usage sketch is shown below. See the project page on [Bitbucket|https://bitbucket.org/scapeuvt/ftp-hadoop] for source code and installation instructions.
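
Because the gateway speaks plain FTP, any commodity client works; the sketch below uses Python's standard {{ftplib}}. The host name, port and credentials are assumptions, so refer to the installation instructions for the actual values.

{code:python}
"""Sketch: manipulate HDFS through the HDFS-FTP gateway using a plain
FTP client (Python's standard ftplib). Host, port and credentials
are assumptions; see the Bitbucket page for the real setup.
"""
from ftplib import FTP

ftp = FTP()
ftp.connect("hdfs-ftp.uvt.example", 2121)  # gateway address: assumption
ftp.login("scape_user", "secret")

# Regular FTP verbs are translated into HDFS operations by the gateway.
ftp.cwd("/user/scape")
with open("sample.warc.gz", "rb") as fh:
    ftp.storbinary("STOR sample.warc.gz", fh)
print(ftp.nlst())  # list the HDFS directory contents

ftp.quit()
{code}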


h6. IaaS Services

As part of the SCAPE Project, UVT provides IaaS hosting services for SCAPE Consortium members. These hosting services include the ability to deploy Virtual Machines on top of the IaaS infrastructure, specifically the Eucalyptus middleware. For instance, FIZ team members use the IaaS hosting facility, leveraging VMs to host Fedora Directory development.
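
Since Eucalyptus exposes an EC2-compatible API, instances can be started with standard EC2 tooling. The sketch below uses {{boto}} 2.x; the endpoint, credentials and image id (EMI) are hypothetical placeholders issued by UVT to each consortium member.

{code:python}
"""Sketch: start a VM on the UVT Eucalyptus cloud through the
EC2-compatible API (boto 2.x). Endpoint, keys and the EMI id
are placeholders.
"""
from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo

conn = EC2Connection(
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
    is_secure=False,
    port=8773,                      # default Eucalyptus API port
    path="/services/Eucalyptus",
    region=RegionInfo(name="eucalyptus", endpoint="cloud.uvt.example"),
)

# Launch one instance of a registered Eucalyptus machine image (EMI).
reservation = conn.run_instances("emi-12345678", key_name="scape-key",
                                 instance_type="m1.small")
print(reservation.instances[0].id)
{code}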



h5. PSNC Data Center services

Based on the [scenarios identified by WCPT|SP:Medical Dataset], it was agreed to implement and deploy several services at PSNC for integrating the WCPT working environment. The following services are needed to execute specific scenarios at WCPT:
* DICOM Download server - this service is responsible for providing access to all anonymized DICOM files stored at PSNC. It is necessary to execute the scenario named [large-scale access at hospital|SP:Large scale access at hospital], because the working environment at WCPT needs the stored DICOM files in order to present them to all interested WCPT users. It will also be used in the educational portal (related to the [large scale access for educational purposes|SP:Large scale access for educational purposes] scenario) as a background service for accessing DICOM files. Project repository: [https://git.man.poznan.pl/stash/projects/SCAP/repos/dds/browse]
* DICOM HDFS-enabled server - this service allows uploading anonymized DICOM files to PSNC's HDFS cluster (see the upload sketch after this list). It is necessary to execute the scenario named [large-scale ingest of medical data|SP:Large scale ingest of medical data], as the working environment at WCPT needs to transfer anonymized DICOM files to the PSNC Data Center. This service is also an important building block for the scenario named [large scale access for educational purposes|SP:Large scale access for educational purposes], because the educational portal will provide an on-line viewer of the stored DICOM files. Project repository: [https://git.man.poznan.pl/stash/projects/SCAP/repos/dicom/browse]
* HL7 HDFS-enabled gateway - this service allows uploading HL7 metadata files about patients' visits. It is necessary to execute the scenario named [large-scale ingest of medical data|SP:Large scale ingest of medical data], as the working environment at WCPT needs to transfer HL7 files to the PSNC Data Center. This service is also important in the context of the scenarios named [large scale access for educational purposes|SP:Large scale access for educational purposes] and [large scale analysis|SP:Large scale analysis], because in the former scenario the HL7 data need to be presented on-line, while in the latter the HL7 files will be analyzed with dedicated Hadoop jobs. Project repository: [https://git.man.poznan.pl/stash/projects/SCAP/repos/hl7/browse]
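
As an illustration of the ingest path for the DICOM HDFS-enabled server, the sketch below sends an anonymized file with a standard DICOM C-STORE using {{pydicom}} and {{pynetdicom}}. Note that the server's actual interface, host and port are assumptions here; the Stash repository linked above documents the real protocol.

{code:python}
"""Sketch: push an anonymized DICOM file to an ingest endpoint via a
standard DICOM C-STORE (pydicom + pynetdicom). Whether the PSNC
server accepts C-STORE, and its host/port, are assumptions.
"""
from pydicom import dcmread
from pynetdicom import AE
from pynetdicom.sop_class import CTImageStorage

ds = dcmread("study/slice001.dcm")
# Assumption: anonymization replaces the patient name with 'ANONYMOUS';
# only anonymized data may leave the WCPT environment.
assert str(ds.PatientName) == "ANONYMOUS"

ae = AE()
ae.add_requested_context(CTImageStorage)
assoc = ae.associate("dicom-ingest.psnc.example", 11112)
if assoc.is_established:
    status = assoc.send_c_store(ds)
    print("C-STORE status: 0x%04X" % status.Status)
    assoc.release()
{code}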

All of the services are currently deployed on the PSNC Development cluster and are being tested in the WCPT working environment. After the testing period and the implementation of any improvements found necessary, the services will move to a production environment. The final deployment of all the tools at the Data Center is envisioned to happen in M40.