hadoop-common-user mailing list archives

From Aaron Kimball <aa...@cloudera.com>
Subject Re: HDFS is not loading evenly across all nodes.
Date Thu, 18 Jun 2009 20:27:13 GMT
As an addendum, running a DataNode on the same machine as a NameNode is
generally considered a bad idea because it hurts the NameNode's ability to
maintain high throughput.

- Aaron

On Thu, Jun 18, 2009 at 1:26 PM, Aaron Kimball <aaron@cloudera.com> wrote:

> Did you run the dfs put commands from the master node?  If you're inserting
> into HDFS from a machine running a DataNode, the local datanode will always
> be chosen as one of the three replica targets. For more balanced loading,
> you should use an off-cluster machine as the point of origin.
> If you experience uneven block distribution, you should also periodically
> rebalance your cluster by running bin/start-balancer.sh. It works in the
> background to move blocks from heavily loaded nodes to underutilized ones.
> - Aaron
> On Thu, Jun 18, 2009 at 12:57 PM, openresearch <
> Qiming.He@openresearchinc.com> wrote:
>> Hi all,
>> I ran "dfs put" to load a large dataset onto a 10-node cluster.
>> When I watch Hadoop's progress (via web:50070) and each local file
>> system (via df -k), I notice that my master node is hit 5-10 times harder
>> than the others, so its hard drive fills up faster than theirs. During
>> last night's load, it actually crashed when the hard drive was full.
>> To my understanding, data should be spread across all nodes evenly (in a
>> round-robin fashion, using 64 MB as the unit).
>> Is this expected behavior for Hadoop? Can anyone suggest a good way to
>> troubleshoot it?
>> Thanks
>> --
>> View this message in context:
>> http://www.nabble.com/HDFS-is-not-loading-evenly-across-all-nodes.-tp24099585p24099585.html
>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
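
The replica placement described in the reply above can be illustrated with a small back-of-the-envelope calculation. This is only a sketch, not HDFS code: the function name expected_replicas_per_block is hypothetical, and it assumes the replicas that are not written locally are spread uniformly over the remaining nodes (rack awareness and free-space checks are ignored). Under those assumptions, writing from a DataNode in a 10-node cluster with 3 replicas puts about 4.5 times as much data on the writer as on any other node, which is roughly the low end of the imbalance reported in the original question.

```python
from fractions import Fraction


def expected_replicas_per_block(num_nodes, replication=3, writer_is_datanode=True):
    """Return (replicas on the writer node, replicas on each other node) per block.

    Assumption (not taken from HDFS source): replicas that are not placed
    locally are spread uniformly over the other nodes.
    """
    if num_nodes < replication:
        raise ValueError("need at least as many nodes as replicas")
    if writer_is_datanode:
        # One replica always lands on the local DataNode.
        local = Fraction(1)
        # The remaining replicas are shared by the other nodes.
        other = Fraction(replication - 1, num_nodes - 1)
    else:
        # Off-cluster writer: no node is favored.
        local = other = Fraction(replication, num_nodes)
    return local, other


local, other = expected_replicas_per_block(10)
ratio = local / other
print(f"writer node: {float(local):.2f} replicas/block")
print(f"other nodes: {float(other):.2f} replicas/block")
print(f"imbalance:   {float(ratio):.1f}x")
```

Calling expected_replicas_per_block(10, writer_is_datanode=False) gives the same share for every node, which is the reasoning behind loading from an off-cluster machine.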
