hadoop-common-user mailing list archives

From "Yoram Arnon" <yar...@yahoo-inc.com>
Subject RE: Rebalancing a DFS cluster
Date Wed, 25 Oct 2006 18:14:41 GMT
Online, you can copy files/folders to a new location, then delete the
original and rename the copy back to the original name. The blocks of the
new copy will be uniformly distributed across the cluster. Distcp is
useful for copying large amounts of data.
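
As a rough illustration of the online approach, here is a minimal Python sketch that only builds the three DFS shell commands (copy, delete, rename) for one path. It assumes the `bin/hadoop dfs` shell with its `-cp`, `-rmr` and `-mv` subcommands; the `.rebalanced` suffix, the function name and the example path are hypothetical, chosen just for this sketch.

```python
import shlex


def online_rebalance_commands(path, hadoop="bin/hadoop", suffix=".rebalanced"):
    """Return the copy / delete / rename commands for one DFS path.

    Assumption: `hadoop dfs -cp`, `-rmr` and `-mv` are available in the
    installed Hadoop shell. Nothing is executed here; the caller decides
    when (and whether) to run each step.
    """
    path = path.rstrip("/")
    if not path:
        raise ValueError("refusing to rebalance the DFS root")
    tmp = path + suffix  # hypothetical temporary name for the fresh copy
    return [
        [hadoop, "dfs", "-cp", path, tmp],   # new blocks get spread over all datanodes
        [hadoop, "dfs", "-rmr", path],       # drop the original, unevenly placed copy
        [hadoop, "dfs", "-mv", tmp, path],   # put the copy back under the old name
    ]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in online_rebalance_commands("/user/david/logs"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Run the steps one at a time and check that the copy completed before issuing the delete. For very large directories, the copy step is where distcp would be substituted.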
Offline, you can move dfs block files from old machines to the new machine 
(scp src:<dfs.data.dir>/blk_????? dst:<dfs.data.dir>). 
On startup, each datanode will report its blocks to the namenode and the
namenode will make sense of it all.
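
For the offline approach, the only real arithmetic is deciding how many block files to move off each old machine. The sketch below is one possible way to plan that, assuming you have already counted the `blk_` files under each datanode's `dfs.data.dir`; the function name and node names are hypothetical. It aims to give the new machine roughly an equal share and takes blocks from the fullest machines first.

```python
def plan_block_moves(block_counts):
    """Plan how many block files each old datanode hands to one new, empty node.

    block_counts maps an old node's name to the number of blk_ files it holds.
    Returns a dict of node name -> number of block files to move.
    Assumption: the goal is an equal share (total // number of nodes,
    counting the new one) on the new node.
    """
    if not block_counts:
        return {}
    total = sum(block_counts.values())
    target = total // (len(block_counts) + 1)  # equal share per node
    remaining = target                          # what the new node may still receive
    moves = {}
    # Fullest nodes first; ties broken by name so the plan is deterministic.
    for node, count in sorted(block_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        n = min(count - target, remaining)
        if n <= 0:
            break  # later nodes hold no more than this one, so nothing is left to plan
        moves[node] = n
        remaining -= n
    return moves


if __name__ == "__main__":
    print(plan_block_moves({"node1": 400, "node2": 400, "node3": 400}))
```

This only counts files. It does not check whether the same block ID is picked from two source machines, which would leave two replicas of one block on the new machine, so pick the actual files to scp with that in mind.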

There's no fancy method of rebalancing, let alone proportional block


> -----Original Message-----
> From: David Pollak [mailto:dpp@athena.com] 
> Sent: Wednesday, October 25, 2006 9:35 AM
> To: hadoop-user@lucene.apache.org
> Subject: Rebalancing a DFS cluster
> Howdy,
> I've got a DFS cluster.  I added a new machine to my cluster.  The  
> new machine is the fastest in the cluster (a Core 2 Duo E6600 which  
> blows every machine I've ever used out of the water... but I  
> digress.)  I'd like to rebalance some of the files in my DFS cluster
> so this machine has files on its local filesystem.  Is there a way to
> tell DFS to rebalance and (okay this is a wish-list item) put a
> "speed factor" on each of the slaves so that faster machines will get
> more data on their local drives so the machines that run faster are
> more likely to have local data.
> Thanks,
> David
