hadoop-common-user mailing list archives

From C G <parallel...@yahoo.com>
Subject What happens in HDFS DataNode recovery?
Date Sun, 25 Jan 2009 02:19:03 GMT
Hi All:

I elected to take a node out of one of our grids for service.  Naturally HDFS recognized the
loss of the DataNode and did the right thing, repairing the under-replicated blocks and ultimately
delivering a clean file system.

So now the node I removed is ready to go back in service.  When I return it, a number of
files will suddenly have a replication factor of 4 instead of 3.  My questions:

1.  Will HDFS delete a copy of the data to bring replication back to 3?
2.  If the answer to (1) is yes, will it remove the extra copy from other nodes, from the
returned node, or from both?
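
For concreteness, the situation behind these questions can be sketched with a toy model.
This is not HDFS code: the node names (dn1 to dn5), the block names, and the
lowest-name-first copy policy are all made up for illustration. It only shows how the
returned node's old copies push some blocks to 4 replicas; what HDFS does next is exactly
what I am asking.

```python
# Toy model of the scenario, NOT HDFS code. Node and block names are made up.
TARGET_REPLICATION = 3


def live_replicas(placement, live_nodes):
    """Count the replicas of each block that sit on nodes currently in service."""
    return {blk: len(nodes & live_nodes) for blk, nodes in placement.items()}


def re_replicate(placement, live_nodes, target=TARGET_REPLICATION):
    """Copy under-replicated blocks to live nodes that lack them.

    The lowest-name-first choice is an arbitrary stand-in for a real
    placement policy.
    """
    for nodes in placement.values():
        candidates = sorted(live_nodes - nodes)
        while len(nodes & live_nodes) < target and candidates:
            nodes.add(candidates.pop(0))


all_nodes = {"dn1", "dn2", "dn3", "dn4", "dn5"}
placement = {
    "blk_a": {"dn1", "dn2", "dn3"},
    "blk_b": {"dn1", "dn4", "dn5"},
    "blk_c": {"dn2", "dn3", "dn4"},
}

# Take dn1 out of service; the remaining nodes repair replication.
in_service = all_nodes - {"dn1"}
re_replicate(placement, in_service)

# Bring dn1 back with its old block copies still on disk.
counts = live_replicas(placement, all_nodes)
over_replicated = sorted(b for b, n in counts.items() if n > TARGET_REPLICATION)

print(counts)
print(over_replicated)
```

In this toy run, the two blocks that had a copy on dn1 end up with 4 replicas once dn1
returns, and the block that never lived on dn1 stays at 3.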

The motivation for asking these questions is that I have a file system which is extremely unbalanced
- we recently doubled the size of the grid when a few dozen terabytes were already stored on the
existing nodes.  I am wondering if an easy way to restore some sense of balance is to cycle
through the old nodes, removing each one from service for several hours and then returning it
to service.


Thanks in advance,

