hadoop-hdfs-user mailing list archives

From Jonathan Disher <jdis...@parad.net>
Subject HDFS network bottleneck - namenode?
Date Tue, 10 May 2011 12:50:03 GMT
I will preface this with a couple of statements: a) it's almost 6am and I've been up all night,
and b) I'm drugged up from an allergic reaction, so I may not be firing on all 64 bits.

Do I understand the HDFS architecture correctly, in that the namenode is a network bottleneck
into the system?  I.e., no matter how many Ethernet interfaces I roll into my data nodes,
I will always be limited in how much traffic I can drive to the HDFS pool by the network
capacity of the namenode?

I am trying to move a -lot- of data, and I'd like to not throttle the namenode (especially
in the old cluster, where I cannot just bond up more interfaces).  If there's a way to spread
the inbound network traffic (for block writes), I'd love to hear it.

