This page collects benchmark results from SCAPE partners' Hadoop installations so that the installations can be compared.

Details are in this blog post: http://openplanetsfoundation.org/blogs/2013-09-30-let%E2%80%99s-benchmark-our-hadoop-clusters-join

Aggregate results

Ratios of throughput (bytes/s) relative to BL (BL = 1.0). BL is the British Library; SB 1 and SB 2 are the first and second result sets from the Danish State and University Library. N/A marks a benchmark that produced no result in that run.

Type          BL   SB 1  SB 2  Limiting capabilities in test
NUTCHINDEX    1.0  3.2   1.9   Balanced
WORDCOUNT     1.0  3.4   4.2   CPU-bound
DFSIOE-READ   1.0  2.5   1.9   IO-bound
DFSIOE-WRITE  1.0  3.4   1.4   IO-bound
HIVEAGGR      1.0  N/A   39.4  ?
HIVEJOIN      1.0  N/A   68.6  ?
KMEANS        1.0  N/A   2.1   Map: CPU-bound, Reduce: IO-bound
PAGERANK      1.0  3.1   1.2   ? Network-bound?
BAYES         1.0  1.8   1.7   Balanced
SORT          1.0  2.9   0.9   IO-bound
TERASORT      1.0  2.2   1.2   RAM-bound, Map: CPU-bound, Reduce: IO-bound

Categories from: https://github.com/intel-hadoop/HiBench/raw/master/WISS10_conf_full_011.pdf

Approximate ratios per workload type (by manual estimation)

Workload type                   BL  SB 1  SB 2
Balanced workload (IO/RAM/CPU)  1   2.5   1.8
IO-bound workload               1   3     1.7
CPU-bound workload              1   3.4   4.2
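
The per-benchmark ratios in the first table appear to be each installation's throughput divided by the British Library throughput for the same benchmark. The following is a minimal sketch under that assumption, using three rows of throughput figures copied from the result tables on this page. The names `THROUGHPUT` and `throughput_ratios` are illustrative and not part of HiBench.

```python
# Throughput (bytes/s) copied from the result tables on this page.
# Column order: BL, SB 1, SB 2. None marks a benchmark with no result.
THROUGHPUT = {
    "NUTCHINDEX": (2425366, 7687596, 4568839),
    "WORDCOUNT": (69300634, 233907918, 289195018),
    "HIVEAGGR": (78363540, None, 3090849667),
}


def throughput_ratios(row):
    """Divide every value by the first (baseline) value, rounded to one decimal.

    Assumption: the first column is the baseline installation.
    """
    baseline = row[0]
    return tuple(None if value is None else round(value / baseline, 1)
                 for value in row)


for benchmark, row in THROUGHPUT.items():
    print(benchmark, throughput_ratios(row))
```

For these three rows the sketch gives 1.0, 3.2, 1.9 for NUTCHINDEX and 1.0, 3.4, 4.2 for WORDCOUNT, which agrees with the table. The per-workload-type ratios are stated to be manual estimates, so they are not derived here.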

British Library

Cluster:

Results for the British Library Digital Preservation Hadoop cluster:

(1 JobTracker/NameNode, 28 TaskTracker/DataNodes, 6GB RAM/1 CPU/500GB HDD per node)

Results:

Type Date Time Input_data_size Duration(s) Throughput(bytes/s) Throughput/node
NUTCHINDEX 18/09/2013 14:11:02 586453704 241.8 2425366 86620
WORDCOUNT 18/09/2013 11:30:51 89600175810 1292.92 69300634 2475022
DFSIOE-READ 18/09/2013 11:51:54 54005188956 368.998 146356318 5227011
DFSIOE-WRITE 18/09/2013 12:00:10 27323986498 488.494 55935152 1997684
HIVEAGGR 18/09/2013 12:12:02 17713294775 226.04 78363540 2798697
HIVEJOIN 18/09/2013 12:17:34 18540722846 305.227 60744045 2169430
KMEANS 19/09/2013 09:58:57 504003386 334.933 1504788 53742
PAGERANK 18/09/2013 12:23:11 398276167 186.61 2134270 76223
BAYES 18/09/2013 14:29:29 177879705 1004.847 177021 6322
SORT 18/09/2013 12:47:02 67200178803 913.232 73585002 2628035
TERASORT 18/09/2013 12:53:09 10000000000 201.578 49608588 1771735
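
The last two columns are consistent with throughput being the input data size divided by the duration, and throughput per node being that figure divided by the number of TaskTracker/DataNodes (28 for this cluster). The sketch below checks that reading against the NUTCHINDEX row. The function name `derive_throughput` is hypothetical, and the formula is inferred from the figures on this page rather than taken from the HiBench source.

```python
def derive_throughput(input_bytes, duration_s, nodes):
    """Return (throughput in bytes/s, throughput per node in bytes/s).

    Assumption: throughput = input size / duration, and the per-node
    figure divides that by the number of TaskTracker/DataNodes.
    """
    throughput = input_bytes / duration_s
    return throughput, throughput / nodes


# NUTCHINDEX row from the British Library table: 586453704 bytes in 241.8 s
# on 28 worker nodes.
total, per_node = derive_throughput(586453704, 241.8, 28)
print(f"throughput: {total:.2f} bytes/s, per node: {per_node:.2f} bytes/s")
```

The computed values agree with the reported 2425366 and 86620 to within one byte per second; the reported figures look like whole numbers with the fractional part dropped.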

Danish State and University Library

(1 JobTracker/NameNode, 3 TaskTracker/DataNodes, 96GB RAM / 2 CPU (6 cores - 12 threads) / ? HDD per node)

Results:

Type Date Time Input_data_size Duration(s) Throughput(bytes/s) Throughput/node
NUTCHINDEX 2013-11-26 09:28:55 594889272 77.383 7687596 2562532
WORDCOUNT 2013-11-26 09:38:15 96000020276 410.418 233907918 77969306
DFSIOE-READ 2013-11-26 09:44:04 54005177946 150.323 359260911 119753637
DFSIOE-WRITE 2013-11-26 09:46:29 27319740657 142.999 191048473 63682824
HIVEAGGR N/A (Hive not installed)
HIVEJOIN N/A (Hive not installed)
KMEANS N/A (problem not identified)
PAGERANK 2013-11-26 09:48:20 398276167 60.617 6570370 2190123
BAYES 2013-11-26 09:58:18 180094898 574.317 313580 104526
SORT 2013-11-26 10:05:53 72000018177 332.034 216845317 72281772
TERASORT 2013-11-26 10:08:12 10000000000 89.762 111405717 37135239

Note: An EMC Isilon setup handles the NameNode and DataNode roles.

Type Date Time Input_data_size Duration(s) Throughput(bytes/s) Throughput/node
NUTCHINDEX 2014-01-24 08:45:46 594890284 130.206 4568839 1142209
WORDCOUNT 2014-01-24 08:58:56 128000028859 442.608 289195018 72298754
DFSIOE-READ 2014-01-24 09:11:23 54005212089 193.870 278564048 69641012
DFSIOE-WRITE 2014-01-24 09:17:07 27322382558 341.443 80020332 20005083
HIVEAGGR 2014-01-24 09:26:13 17590025456 5.691 3090849667 772712416
HIVEJOIN 2014-01-24 09:26:27 18417453527 4.420 4166844689 1041711172
KMEANS 2014-01-24 09:30:09 504003386 159.311 3163644 790911
PAGERANK 2014-01-24 09:33:25 398276167 152.253 2615883 653970
BAYES 2014-01-24 09:43:39 180094898 589.064 305730 76432
SORT 2014-01-24 10:13:31 96000029057 1519.260 63188676 15797169
TERASORT 2014-01-24 10:18:33 10000000000 168.287 59422296 14855574

Note: Each node uses network storage (an NFS mount) as its 'local' storage.
