hadoop-common-user mailing list archives

From Mathias Herberts <mathias.herbe...@gmail.com>
Subject Re: distcp performing much better for rebalancing than dedicated balancer
Date Thu, 05 May 2011 12:57:26 GMT
Did you explicitly start a balancer, or did you decommission the nodes
using dfs.hosts.exclude and a dfsadmin -refreshNodes?

On Thu, May 5, 2011 at 14:30, Ferdy Galema <ferdy.galema@kalooga.com> wrote:
> Hi,
>
> On our 15-node cluster (1GB ethernet and 4x1TB disk per node) I noticed that
> distcp does a much better job at rebalancing than the dedicated balancer
> does. We needed to decommission 11 nodes, so that prior to rebalancing we had
> 4 used and 11 empty nodes. The 4 used nodes had about 25% usage each. Most
> of our files are of average size: We have about 500K files in 280K blocks
> and 800K blocks total (blocksize is 64MB).
>
> So I changed dfs.balance.bandwidthPerSec to 800100100 and restarted the
> cluster. I started the balancer tool and noticed that it moved about
> 200GB in 1 hour. (I grepped the balancer log for "Need to move").
>
> After stopping the balancer I started a distcp.  This tool copied 900GB in
> just 45 minutes, with an average replication of 2, so its total throughput
> was around 2.4 TB/hour. Fair enough, it is not purely rebalancing because
> the 4 overused nodes also get new blocks, still it performs much better.
> Munin confirms the much higher disk/ethernet throughputs of the distcp.
>
> Are these characteristics to be expected? Either way, can the balancer be
> boosted even more? (Aside from the dfs.balance.bandwidthPerSec property).
>
> Ferdy.
>
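
For anyone following the numbers, the figures quoted above can be checked with a quick calculation. This is only a rough sketch, not something from the original poster: it assumes decimal units (1 GB = 10^9 bytes) and that "1GB ethernet" means 1 Gbit/s links, and the helper name throughput_gb_per_hour is made up for illustration. dfs.balance.bandwidthPerSec is given in bytes per second per datanode.

```python
# Sanity check of the figures quoted in the message above.
# Assumptions (not stated in the original): decimal units, 1 Gbit/s links.

def throughput_gb_per_hour(logical_gb, minutes, replication=1):
    """Data physically written per hour, counting every replica."""
    return logical_gb * replication / (minutes / 60)

# Balancer: about 200 GB moved in 1 hour (each move relocates one replica).
balancer = throughput_gb_per_hour(200, 60)

# distcp: 900 GB copied in 45 minutes with an average replication of 2.
distcp = throughput_gb_per_hour(900, 45, replication=2)

# Value set for dfs.balance.bandwidthPerSec, in bytes per second.
configured_cap = 800100100

# Assumed link speed: 1 Gbit/s expressed in bytes per second.
link_bytes_per_sec = 1e9 / 8

# Aggregate balancer rate across the whole cluster, in bytes per second.
balancer_bytes_per_sec = balancer * 1e9 / 3600

print(f"balancer: {balancer:.0f} GB/hour")
print(f"distcp:   {distcp:.0f} GB/hour ({distcp / balancer:.0f}x the balancer)")
print(f"cap / link speed: {configured_cap / link_bytes_per_sec:.1f}")
print(f"balancer aggregate: {balancer_bytes_per_sec / 1e6:.1f} MB/s")
```

Under those assumptions the configured cap is several times what a single link could carry, while the balancer's aggregate rate stays below even one link's capacity. That suggests the bandwidth property was probably not the limiting factor in this run.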
