Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-user@hadoop.apache.org
Received-SPF: neutral (athena.apache.org: local policy)
MIME-Version: 1.0
In-Reply-To: <24099585.post@talk.nabble.com>
References: <24099585.post@talk.nabble.com>
From: Aaron Kimball <aaron@cloudera.com>
Date: Thu, 18 Jun 2009 13:26:34 -0700
Message-ID: <d6d7c4410906181326o3d9e29dfw4e571ed93a797b6c@mail.gmail.com>
Subject: Re: HDFS is not loading evenly across all nodes.
To: core-user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=00163691fe65881666046ca53a44

--00163691fe65881666046ca53a44
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Did you run the dfs put commands from the master node? If you're inserting
into HDFS from a machine running a DataNode, the local DataNode will always
be chosen as the target for the first replica of each block (one of three
with the default replication factor). For more balanced loading, you should
use an off-cluster machine as the point of origin.
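
To get a rough feel for how large the skew can be, here is a minimal
simulation sketch. It assumes a deliberately simplified placement model:
the first replica goes to the writer's local DataNode and the remaining
replicas go to distinct, uniformly random other nodes. Real HDFS placement
is also rack-aware, which this ignores. The function name simulate_put and
all of its parameters are hypothetical, made up for illustration only.

```python
import random

def simulate_put(num_nodes=10, num_blocks=10000, replication=3,
                 writer=0, seed=0):
    """Count replicas stored per node under a simplified placement model.

    writer is the index of the node running the put, or None for an
    off-cluster client (all replicas are then placed at random).
    """
    rng = random.Random(seed)
    counts = [0] * num_nodes
    for _ in range(num_blocks):
        if writer is None:
            # Off-cluster client: no local DataNode to favor.
            targets = rng.sample(range(num_nodes), replication)
        else:
            # First replica is local; the rest go to other nodes.
            others = [n for n in range(num_nodes) if n != writer]
            targets = [writer] + rng.sample(others, replication - 1)
        for node in targets:
            counts[node] += 1
    return counts

counts = simulate_put()
avg_other = sum(counts[1:]) / (len(counts) - 1)
print("writer node:", counts[0])                  # writer node: 10000
print("ratio:", round(counts[0] / avg_other, 2))  # ratio: 4.5
```

Under these assumptions the writing node ends up holding one replica of
every block, about 4.5 times as many as the average of the other nine
nodes. Calling simulate_put(writer=None) models an off-cluster client and
gives a roughly even spread.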

If you experience uneven block distribution, you should also rebalance
your cluster periodically by running bin/start-balancer.sh. It works in
the background to move blocks from heavily-laden nodes to underutilized
ones.
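
As a sketch of the idea behind the balancer (not its actual code): it
compares each DataNode's utilization with the cluster-wide utilization and
treats a node as out of balance when the two differ by more than a
threshold percentage, which I believe defaults to 10. The function
classify_nodes, the node names, and the disk figures below are all
hypothetical, chosen only to illustrate the comparison.

```python
def classify_nodes(used, capacity, threshold_pct=10.0):
    """Label each node relative to the cluster-wide utilization.

    used and capacity map node name -> bytes (any consistent unit).
    Returns (cluster utilization in percent, {node: label}).
    """
    cluster_util = 100.0 * sum(used.values()) / sum(capacity.values())
    labels = {}
    for node in used:
        util = 100.0 * used[node] / capacity[node]
        if util > cluster_util + threshold_pct:
            labels[node] = "over-utilized"
        elif util < cluster_util - threshold_pct:
            labels[node] = "under-utilized"
        else:
            labels[node] = "within threshold"
    return cluster_util, labels

# Hypothetical 10-node cluster, 100 GB per node; the master is nearly full.
capacity = {"master": 100}
used = {"master": 90}
for i in range(1, 10):
    capacity["node%d" % i] = 100
    used["node%d" % i] = 20

cluster_util, labels = classify_nodes(used, capacity)
print(cluster_util)        # 27.0
print(labels["master"])    # over-utilized
print(labels["node1"])     # within threshold
```

In this made-up example only the master exceeds the cluster average by
more than the threshold, so it is the node blocks would be moved away
from.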

- Aaron

On Thu, Jun 18, 2009 at 12:57 PM, openresearch <
Qiming.He@openresearchinc.com> wrote:

>
> Hi all
>
> I "dfs put" a large dataset onto a 10-node cluster.
>
> When I observe the Hadoop progress (via web:50070) and each local file
> system (via df -k),
> I notice that my master node is hit 5-10 times harder than the others, so
> its hard drive fills up more quickly. During last night's load, it
> actually crashed when the hard drive was full.
>
> To my understanding, data should be spread across all nodes evenly (in a
> round-robin fashion using 64M as a unit).
>
> Is this expected behavior of Hadoop? Can anyone suggest a good way to
> troubleshoot it?
>
> Thanks
>
>
> --
> View this message in context:
> http://www.nabble.com/HDFS-is-not-loading-evenly-across-all-nodes.-tp24099585p24099585.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>

--00163691fe65881666046ca53a44--