hadoop-common-user mailing list archives

From Michel Segel <michael_se...@hotmail.com>
Subject Re: Sanity check re: value of 10GbE NICs for Hadoop?
Date Wed, 29 Jun 2011 21:04:10 GMT
I'm not sure which point you are trying to make.
To answer your question...

With respect to price... 10GbE is cost effective.
You have to consider that 1GbE is not only your port speed; it is also going to be the speed
of the uplink or trunk.

So if you continue to build out, you run into bandwidth issues between racks. So you end
up doing 1GbE ports and then higher speed, by either port bonding or bigger bandwidth, for the
uplinks only. These switches are more expensive than simple 1GbE switches, but less than full 10GbE.
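
A back-of-the-envelope sketch of the uplink point, in Python. The rack size (40 nodes) and the three uplink options are made-up figures for illustration, not taken from any particular switch or cluster; `oversubscription` is just a hypothetical helper name.

```python
def oversubscription(nodes_per_rack, port_gbps, uplink_gbps):
    """Ratio of total server-facing bandwidth to uplink bandwidth for one rack."""
    return (nodes_per_rack * port_gbps) / uplink_gbps

# Assumption: a rack of 40 nodes, each on a 1GbE port.
NODES = 40

single_1g = oversubscription(NODES, 1, 1)     # uplink is just another 1GbE port
bonded_4x1g = oversubscription(NODES, 1, 4)   # four bonded 1GbE links as the uplink
uplink_10g = oversubscription(NODES, 1, 10)   # one 10GbE uplink

print(single_1g, bonded_4x1g, uplink_10g)
```

The higher the ratio, the less of its port speed each node can count on when traffic has to leave the rack.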

Depending on vendor, number of ports, and discount, you can get the switch for approx $10,000 and
up. Think $550 to $600 a port for 10GbE.
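
As a rough sanity check on the per-port figure, a small Python sketch of what a rack's worth of 10GbE ports would cost at $550 to $600 a port. The 20-node rack is an assumption for illustration only, and `port_cost_range` is a hypothetical helper.

```python
def port_cost_range(ports, low_per_port=550, high_per_port=600):
    """Return (low, high) total cost in dollars for the given number of ports."""
    return ports * low_per_port, ports * high_per_port

# Assumption: 20 servers per rack, one 10GbE port each.
low, high = port_cost_range(20)
print(f"${low:,} to ${high:,}")
```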

With Sandy Bridge, you will start to see 10GbE on the motherboards.

If you're following the discussion on performance gains, you'll start to see the network being
the bottleneck.

If you are planning to build a new cluster... You should plan on 10GbE.

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 29, 2011, at 1:07 AM, Bharath Mundlapudi <bharathwork@yahoo.com> wrote:
> One could argue that it's too early for 10Gb NICs in a Hadoop cluster. Certainly having extra
> bandwidth is good, but at what price?
>
> Please note that all the points you mentioned can work with 1Gb NICs today, unless you
> can back them with price/performance data. Typically, map output is compressed. If the system is
> hitting peak network utilization, one can select algorithms with a higher compression ratio at the
> cost of CPU. Most of these machines come with dual NICs, so one could do link bonding to
> push more bits.
>
> One area that may benefit from 10Gb NICs is high-density systems: 24 cores and 3x12TB
> disks. This is the trend now and will continue. These systems can saturate 1Gb NICs.
> 
> 
> -Bharath
> 
> 
> 
> ________________________________
> From: Saqib Jang -- Margalla Communications <saqibj@margallacomm.com>
> To: common-user@hadoop.apache.org
> Sent: Tuesday, June 28, 2011 10:16 AM
> Subject: Sanity check re: value of 10GbE NICs for Hadoop?
> 
> Folks,
> 
> I've been digging into the potential benefits of using 10 Gigabit Ethernet (10GbE)
> NIC server connections for Hadoop and wanted to run what I've come up with through
> initial research by the list for 'sanity check' feedback. I'd very much appreciate
> your input on the importance (or lack of it) of the following potential benefits of
> 10GbE server connectivity as well as other thoughts regarding 10GbE and Hadoop (My
> interest is specifically in the value of 10GbE server connections and 10GbE switching
> infrastructure, over scenarios such as bonded 1GbE server connections with 10GbE
> switching).
> 
> 
> 
> 1. HDFS Data Loading. The higher throughput enabled by 10GbE server and switching
> infrastructure allows faster processing and distribution of data.
>
> 2. Hadoop Cluster Scalability. High performance for initial data processing and
> distribution directly impacts the degree of parallelism or scalability supported by
> the cluster.
> 
> 3. HDFS Replication. Higher-speed server connections allow faster file replication.
>
> 4. Map/Reduce Shuffle Phase. Improved end-to-end throughput and latency directly
> impact the shuffle phase of a data set reduction, especially for tasks that work at
> the document level (including large documents) and on the large amounts of metadata
> generated by those documents, as well as video analytics and images.
> 
> 5. Data Reporting. 10GbE server network performance can improve data reporting
> performance, especially if the Hadoop cluster is running multiple data reductions.
>
> 6. Support of Cluster File Systems. With 10GbE NICs, Hadoop could be reorganized to
> use a cluster or network file system. This would allow Hadoop, even with its Java
> implementation, to have higher-performance I/O and not have to be so concerned with
> disk drive density in the same server.
>
> 7. Others?
> 
> thanks,
> Saqib
>
> Saqib Jang
> Principal/Founder
> Margalla Communications, Inc.
> 1339 Portola Road, Woodside, CA 94062
> (650) 274 8745
> www.margallacomm.com
