hadoop-common-user mailing list archives

From "Saqib Jang -- Margalla Communications" <saq...@margallacomm.com>
Subject Sanity check re: value of 10GbE NICs for Hadoop?
Date Tue, 28 Jun 2011 17:16:53 GMT
Folks,

I've been digging into the potential benefits of using 10 Gigabit
Ethernet (10GbE) NIC server connections for Hadoop, and wanted to run
what I've come up with through initial research by the list for
'sanity check' feedback. I'd very much appreciate your input on the
importance (or lack of it) of the following potential benefits of
10GbE server connectivity, as well as any other thoughts regarding
10GbE and Hadoop. (My interest is specifically in the value of 10GbE
server connections with 10GbE switching infrastructure, compared with
scenarios such as bonded 1GbE server connections with 10GbE
switching.)

1. HDFS Data Loading. The higher throughput enabled by 10GbE server
   and switching infrastructure allows faster processing and
   distribution of data.

2. Hadoop Cluster Scalability. High performance in initial data
   processing and distribution directly impacts the degree of
   parallelism or scalability supported by the cluster.

3. HDFS Replication. Higher-speed server connections allow faster
   file replication.

4. Map/Reduce Shuffle Phase. Improved end-to-end throughput and
   latency directly benefit the shuffle phase of a data set
   reduction, especially for tasks that operate at the document level
   (including large documents) with lots of metadata generated by
   those documents, as well as for video analytics and images.

5. Data Reporting. 10GbE server network performance can improve data
   reporting performance, especially if the Hadoop cluster is running
   multiple data reductions.

6. Support of Cluster File Systems. With 10GbE NICs, Hadoop could be
   reorganized to use a cluster or network file system. This would
   allow Hadoop, even with its Java implementation, to achieve
   higher-performance I/O without being so concerned with disk drive
   density in the same server.

7. Others?
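
For items 1 and 3 above, a rough back-of-envelope sketch (not a benchmark) of how long it takes to move a given amount of data over each kind of server link. Everything in it is an assumption made for illustration: the function name `transfer_seconds`, the 90% usable-bandwidth figure, the 1000 GB data size, and the idea that a bonded pair of 1GbE links delivers its full 2 Gb/s aggregate, which is optimistic. It also ignores disk throughput, which may well be the real bottleneck.

```python
def transfer_seconds(data_gb, link_gbps, efficiency=0.9):
    """Seconds to move data_gb gigabytes over a link rated at link_gbps
    gigabits per second. `efficiency` is an assumed fraction of the line
    rate that is actually usable after protocol overhead."""
    if data_gb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_gb must be >= 0, link_gbps > 0, 0 < efficiency <= 1")
    gigabits = data_gb * 8  # 8 bits per byte
    return gigabits / (link_gbps * efficiency)


# Hypothetical link options; the bonded figure assumes ideal aggregation.
links = {"1GbE": 1, "2 x 1GbE bonded": 2, "10GbE": 10}

if __name__ == "__main__":
    data_gb = 1000  # assumed data volume per server, in gigabytes
    for name, gbps in links.items():
        minutes = transfer_seconds(data_gb, gbps) / 60
        print(f"{name:>16}: {minutes:7.1f} minutes for {data_gb} GB")
```

The network-only ratio between 1GbE and 10GbE is simply 10x under these assumptions; whether a cluster sees anything close to that depends on how fast the disks and the job itself can feed the link.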

thanks,

Saqib

Saqib Jang
Principal/Founder
Margalla Communications, Inc.
1339 Portola Road, Woodside, CA 94062
(650) 274 8745
www.margallacomm.com