hadoop-common-user mailing list archives

From "Dhruba Borthakur" <dhr...@yahoo-inc.com>
Subject RE: Redistribute blocks evenly across DFS
Date Wed, 16 May 2007 17:08:07 GMT
I think HDFS always makes every effort to fill up Datanodes uniformly. The
anomaly arises when a large set of new Datanodes is added to an existing
cluster. In that case, one possible approach would be to write a tool that
does the following:

1. Increase the replication factor of each file. This will automatically
create a new replica on those nodes that have more free disk space and are
lightly loaded.

2. Then decrease the replication factor of the file back to its original
value. The HDFS code will automatically select the replica on the most-full
node to be deleted (see HADOOP-1300).

The tool could take a set of HDFS directories as input and then do the above
two steps on all files (recursively) in the set of specified directories.
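The two steps above can be sketched as follows. This is only an illustration of the tool's logic against a stand-in client class (the names `FakeDfsClient`, `list_files`, and `set_replication` are hypothetical, not the real Hadoop API); on a real cluster the equivalent operations would be `FileSystem.setReplication()` from Java, or `hadoop dfs -setrep -R <factor> <path>` from the shell.

```python
class FakeDfsClient:
    """Stand-in for an HDFS client: tracks per-file replication factors."""
    def __init__(self, files):
        # files: dict mapping path -> current replication factor
        self.files = dict(files)

    def list_files(self, directory):
        # Recursively list files under a directory prefix.
        prefix = directory.rstrip('/') + '/'
        return [p for p in self.files if p.startswith(prefix)]

    def set_replication(self, path, factor):
        self.files[path] = factor

def rebalance(client, directories, bump=1):
    """Step 1: raise each file's replication by `bump` so new replicas
    land on lightly loaded nodes; step 2: lower it back to the original,
    letting HDFS drop the replica on the most-full node (HADOOP-1300)."""
    for d in directories:
        for path in client.list_files(d):
            original = client.files[path]
            client.set_replication(path, original + bump)   # step 1
            # ...a real tool would wait here for re-replication to finish...
            client.set_replication(path, original)          # step 2

client = FakeDfsClient({"/data/a": 3, "/data/b": 3})
rebalance(client, ["/data"])
print(client.files)  # every file ends at its original replication factor
```

Note that a real tool would need to wait for the Namenode to finish creating the extra replicas before lowering the factor again, otherwise step 2 may simply cancel the pending replication work.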

Will this approach address your issue?


-----Original Message-----
From: Dennis Kubes [mailto:nutch-dev@dragonflymc.com] 
Sent: Wednesday, May 16, 2007 9:11 AM
To: hadoop-user@lucene.apache.org
Subject: Redistribute blocks evenly across DFS

Is there a way to redistribute blocks evenly across all DFS nodes?  If
not, I would be happy to program a tool to do so, but I would need a
little guidance on how to.

Dennis Kubes
