
This page describes the installation of the SCAPE platform in a High Performance Computing centre. The hardware platform is described [here|http://wiki.opf-labs.org/display/SP/UVT+Hadoop+Platform].

h5. Overview

Deploying the SCAPE platform in large data centres or in Cloud Computing environments raises specific challenges, such as dynamic allocation of components to compute nodes, platform monitoring, and quality of service. Automated cluster provisioning and platform deployment is achieved by integrating and/or extending a set of specific tools:

* Node deployment: a node deployment system, such as [Cobbler|http://www.cobblerd.org/], helps administrators allocate nodes dynamically. Systems can be added to and removed from the management of the node deployment and configuration management systems on the fly, both on bare-metal computing hardware and on virtualized computing resources.
* On-the-fly software deployment using a customized configuration management system ([Puppet|http://puppetlabs.com]): it allows SCAPE software packages to evolve by providing “high-level” recipes that describe the tools and the relations between them. It enables dynamic allocation of SCAPE components to computing resources with minimal human intervention, which makes the software deployment process more deterministic and ensures that software is deployed as its developers expect.
* Monitoring and quality measures: the integrated Puppet configuration management system natively supports integration with, and deployment of, the [Nagios|http://www.nagios.org] monitoring solution. This allows operators and administrators to provide a better QoS.
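
The idea of declarative, repeatable deployment described in the list above can be illustrated with a small sketch. It is not how Puppet is implemented and it does not use any Puppet API; it is a toy model in Python, assuming a hypothetical role catalogue (`DESIRED`) and illustrative package names, that shows why applying the same description twice leaves a node unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A compute node with a role and the packages currently installed on it."""
    name: str
    role: str
    installed: set = field(default_factory=set)


# Hypothetical role catalogue; role and package names are illustrative only.
DESIRED = {
    "hadoop-worker": {"hadoop-datanode", "hadoop-tasktracker"},
    "monitoring": {"nagios"},
}


def missing_packages(node, desired=DESIRED):
    """Return, in sorted order, the packages the node's role requires but lacks."""
    wanted = desired.get(node.role, set())
    return sorted(wanted - node.installed)


def converge(node, desired=DESIRED):
    """Bring the node to its desired state (in memory) and report what changed.

    Calling it again on the same node returns an empty list, which is the
    property that makes repeated deployment runs safe.
    """
    to_install = missing_packages(node, desired)
    node.installed.update(to_install)
    return to_install
```

A node that already satisfies its role produces an empty change list, so a run can be scheduled periodically without side effects on healthy nodes.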
\\

h5. Cloud Deployment Toolkit for SCAPE Platform

The toolkit aims to provide the software components and corresponding Puppet modules for deploying critical SCAPE components in Cloud environments. The Proof of Concept (PoC) aims to demonstrate the operation of selected SCAPE Platform components in Cloud environments, focusing on Eucalyptus and Amazon EC2 and building on their scalability to provide on-demand computing capacity. The toolkit comprises:
* A GUI (web-based portal) for managing a SCAPE Platform deployment on Eucalyptus-based clouds and, in later stages, on Amazon Web Services EC2
* Integration with the Puppet and PuppetDB REST APIs
* An abstraction over the EC2 and Eucalyptus APIs that provides a uniform programming environment and thereby ensures portability
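
The API abstraction in the last item can be sketched as a common interface with one backend per cloud. The sketch below is an assumption about the general shape, not the toolkit's actual code: `CloudProvider`, `InMemoryProvider`, and `scale_out` are hypothetical names, and the backend keeps instances in a dictionary instead of calling Eucalyptus or EC2.

```python
from abc import ABC, abstractmethod
import itertools


class CloudProvider(ABC):
    """Uniform interface that calling code programs against (hypothetical)."""

    @abstractmethod
    def start_instance(self, image_id):
        """Start one instance from an image and return its identifier."""

    @abstractmethod
    def stop_instance(self, instance_id):
        """Stop the instance with the given identifier."""

    @abstractmethod
    def list_instances(self):
        """Return the identifiers of all running instances."""


class InMemoryProvider(CloudProvider):
    """Stand-in backend: tracks instances in a dict, makes no cloud API calls."""

    def __init__(self, prefix):
        self._prefix = prefix
        self._counter = itertools.count(1)
        self._instances = {}

    def start_instance(self, image_id):
        instance_id = f"{self._prefix}-{next(self._counter)}"
        self._instances[instance_id] = image_id
        return instance_id

    def stop_instance(self, instance_id):
        # Stopping an unknown instance is treated as a no-op.
        self._instances.pop(instance_id, None)

    def list_instances(self):
        return sorted(self._instances)


def scale_out(provider, image_id, count):
    """Start `count` instances; works with any CloudProvider backend."""
    return [provider.start_instance(image_id) for _ in range(count)]
```

Because `scale_out` depends only on the interface, the same call works whether `provider` wraps Eucalyptus or EC2, which is the portability the abstraction is meant to give.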

Another tool integrates the Hadoop Distributed File System (HDFS) with more ‘classical’ protocols such as FTP. It is an Apache MINA based FTP server that exposes an HDFS filesystem to local and remote clients that lack HDFS capabilities. One of its main use cases is to facilitate data staging between legacy HPC systems and Hadoop-based computing clusters.
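
From the client side, staging data through such a gateway needs nothing beyond a standard FTP client. The following sketch uses Python's standard `ftplib`; the host, port, credentials, and file names are placeholders supplied by the caller, and the helper names `stage_file` and `stage_to_gateway` are hypothetical rather than part of the tool.

```python
from ftplib import FTP


def stage_file(ftp, local_path, remote_name):
    """Upload one local file over an already connected FTP session."""
    with open(local_path, "rb") as handle:
        # storbinary streams the file in blocks, so it is not read into memory at once.
        ftp.storbinary(f"STOR {remote_name}", handle)


def stage_to_gateway(host, port, user, password, local_path, remote_name):
    """Connect to an FTP gateway (assumed to front HDFS) and upload one file."""
    with FTP() as ftp:
        ftp.connect(host, port)
        ftp.login(user, password)
        stage_file(ftp, local_path, remote_name)
```

A legacy HPC job could call `stage_to_gateway` at the end of a run to push its output towards the Hadoop cluster without installing any Hadoop client libraries.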