hadoop-common-user mailing list archives

From jason hadoop <jason.had...@gmail.com>
Subject Re: What happens in HDFS DataNode recovery?
Date Sun, 25 Jan 2009 03:40:41 GMT
The blocks will be invalidated on the datanode once it is returned to service.
If you want to save your namenode and network a lot of work, wipe the HDFS
block storage directory before returning the DataNode to service.
The directory is given by dfs.data.dir; most likely the value is
${hadoop.tmp.dir}/dfs/data
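
A minimal shell sketch of that wipe-before-rejoin step (the paths and the daemon script name are assumptions based on a default install of that era; verify dfs.data.dir in your own hadoop-site.xml before deleting anything):

```shell
# Hedged sketch of the wipe-before-rejoin step described above.
# Assumption: dfs.data.dir was left at its default of
# ${hadoop.tmp.dir}/dfs/data, and hadoop.tmp.dir at /tmp/hadoop-${USER}.
DATA_DIR="/tmp/hadoop-${USER}/dfs/data"

# With the datanode stopped (bin/hadoop-daemon.sh stop datanode), the
# block store can be wiped and the daemon restarted:
#   rm -rf "${DATA_DIR}"
#   bin/hadoop-daemon.sh start datanode
echo "block storage directory: ${DATA_DIR}"
```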

Jason - Ex Attributor

On Sat, Jan 24, 2009 at 6:19 PM, C G <parallelguy@yahoo.com> wrote:

> Hi All:
>
> I elected to take a node out of one of our grids for service.  Naturally
> HDFS recognized the loss of the DataNode and did the right stuff, fixing
> replication issues and ultimately delivering a clean file system.
>
> So now the node I removed is ready to go back in service.  When I return it
> to service a bunch of files will suddenly have a replication of 4 instead of
> 3.  My questions:
>
> 1.  Will HDFS delete a copy of the data to bring replication back to 3?
> 2.  If (1) above is  yes, will it remove the copy by deleting from other
> nodes, or will it remove files from the returned node, or both?
>
> The motivation for asking these questions is that I have a file system which
> is extremely unbalanced: we recently doubled the size of the grid with a
> few dozen terabytes already stored on the existing nodes.  I am wondering if
> an easy way to restore some sense of balance is to cycle through the old
> nodes, removing each one from service for several hours and then returning
> it to service.
>
> Thoughts?
>
> Thanks in Advance,
> C G
>
