From: "Dhruba Borthakur" <dhruba@yahoo-inc.com>
To: <hadoop-user@lucene.apache.org>
References: <464B2D0A.3070906@dragonflymc.com>
Subject: RE: Redistribute blocks evenly across DFS
Date: Wed, 16 May 2007 10:08:07 -0700
Message-ID: <016001c797dc$c635ea90$639115ac@ds.corp.yahoo.com>
In-Reply-To: <464B2D0A.3070906@dragonflymc.com>

I think HDFS always tries to fill the Datanodes uniformly. An imbalance
arises when a large set of Datanodes is added to an existing cluster. In
that case, one possible approach would be to write a tool that does the
following:

1. Increase the replication factor of each file. This will automatically
create a new replica on nodes that have more free disk space and are
lightly loaded.

2. Then decrease the replication factor of the file back to its original
value. The HDFS code will automatically select the replica on the fullest
node for deletion (see HADOOP-1300).

The tool could take a set of HDFS directories as input and then do the above
two steps on all files (recursively) in the set of specified directories.
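
A rough sketch of such a tool, in Java, might look like the code below. It
is only an illustration under stated assumptions: ReplicatedFs is a
hypothetical interface standing in for the HDFS client, not a real Hadoop
class, and ReplicationBouncer is a made-up name. In a real tool,
setReplication would presumably map onto FileSystem.setReplication(Path,
short). The awaitReplication hook is also an assumption: the namenode
creates replicas asynchronously, so the tool has to wait for the extra
replica before lowering the factor again, otherwise step 2 simply cancels
step 1.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the HDFS client calls the tool would need. */
interface ReplicatedFs {
    /** All files under dir, including those in subdirectories. */
    List<String> listFilesRecursively(String dir);

    short getReplication(String file);

    void setReplication(String file, short replication);

    /** Blocks until the file actually has the given number of replicas. */
    void awaitReplication(String file, short replication);
}

/** Sketch of the two-step tool: raise replication, wait, then restore it. */
final class ReplicationBouncer {
    private final ReplicatedFs fs;

    ReplicationBouncer(ReplicatedFs fs) {
        this.fs = fs;
    }

    /** Processes every file under the given directories once; returns the file count. */
    int rebalance(List<String> dirs) {
        // Deduplicate so overlapping input directories do not process a file twice.
        Set<String> files = new LinkedHashSet<>();
        for (String dir : dirs) {
            files.addAll(fs.listFilesRecursively(dir));
        }
        for (String file : files) {
            short original = fs.getReplication(file);
            short raised = (short) (original + 1);

            // Step 1: ask for one extra replica and wait until it exists.
            fs.setReplication(file, raised);
            fs.awaitReplication(file, raised);

            // Step 2: restore the original factor; the file system then
            // chooses which replica to delete.
            fs.setReplication(file, original);
        }
        return files.size();
    }
}
```

Handling one file at a time keeps the temporary extra disk usage small, at
the cost of a longer total run. Raising all files first and restoring them
afterwards would also work if the cluster has the spare capacity.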

Will this approach address your issue?

Thanks,
dhruba

-----Original Message-----
From: Dennis Kubes [mailto:nutch-dev@dragonflymc.com] 
Sent: Wednesday, May 16, 2007 9:11 AM
To: hadoop-user@lucene.apache.org
Subject: Redistribute blocks evenly across DFS

Is there a way to redistribute blocks evenly across all DFS nodes?  If 
not, I would be happy to program a tool to do so, but I would need a 
little guidance on how to do it.

Dennis Kubes