Skip to end of metadata
Go to start of metadata

This is a brief outline of the web archiving team's Hadoop cluster, from which the BL SCAPE team have negotiated access and storage space so that it can act as our SCAPE testbed.

Basic Cluster Software

  • Red Hat Enterprise Linux Server release 5.7 (Tikanga)
  • Cloudera CDH3u1
    • Hadoop 0.20.2
  • Java 1.6.0_26

Cluster Hardware

We have two racks, each containing 25 HP DL120 G6 servers. Each server (node) is:

CPU: Intel Xeon X3450 (2.66GHz/4-core/8MB/95W)
NIC: 2x HP NC107i PCI Express Gigabit Server Adapter
DISK: Upgraded to include 3 x 2TB disks per node.

Therefore, we have a total of 200 cores, with 800 GB of RAM and 300 TB of disk space for HDFS (i.e. 100 TB of triply-redundant storage).

Network Infrastructure

Each rack contains a FASTIRON EDGE X448 (FESX448) switch and are connected via a 10Gb link.

Front end

This server acts as the user front-end from which Hadoop jobs are submitted and upon which any post-processing is performed.

CPU: 24 core
NIC: ?

Name Node and Task Tracker

Separate servers for the Name Node and Task Tracker. Both are:

HP Proliant DL160 G6 Server Rack (1UI)
CPU: Intel Xeon (Quad-Core) (E5520) 2.26GHz
NIC: Embedded HP NC362i Dual Port 1 Gigabit

Enter labels to add to this page:
Please wait 
Looking for a label? Just start typing.