hadoop-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: question about hdfs data loss risk
Date Sun, 27 Oct 2013 22:03:48 GMT

1) You may want to read about proper node decommissioning.

2) NameNode will replicate blocks when they do not comply with their
replication factor.

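3) NameNode does not give up. A block with no live replicas is reported
as missing, but it stays in the namespace and becomes available again
once a DataNode holding a replica comes back and reports it.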

4) Yes, ultimately, if you have a replication factor of n and the n
replicas are lost at the same time, well, the data is truly lost. But
that's not specific to Hadoop.
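
To make points 2) to 4) concrete, here is a toy model of the scenario from
the question. It is only a sketch and not HDFS code: the class ToyNameNode
and all of its methods are hypothetical names invented for illustration.
The one assumption it encodes is that the block's entry in the namespace is
kept separately from the list of replica locations, so losing every replica
location does not delete the block itself.

```python
# Toy model only: NOT HDFS code. All names here are hypothetical.
class ToyNameNode:
    def __init__(self, replication=2):
        self.replication = replication
        self.blocks = set()      # block IDs known to the namespace
        self.locations = {}      # datanode name -> block IDs it last reported

    def add_block(self, block_id, datanodes):
        self.blocks.add(block_id)
        for dn in datanodes:
            self.locations.setdefault(dn, set()).add(block_id)

    def datanode_lost(self, dn):
        # Forget the node's replicas, but never the block itself.
        self.locations.pop(dn, None)

    def block_report(self, dn, block_ids):
        # A (re)joining datanode reports the blocks it currently holds.
        self.locations[dn] = set(block_ids) & self.blocks

    def status(self, block_id):
        live = sum(block_id in held for held in self.locations.values())
        if live == 0:
            return "missing"
        if live < self.replication:
            return "under-replicated"
        return "healthy"


nn = ToyNameNode(replication=2)
nn.add_block("b1", ["dn1", "dn2"])
history = [nn.status("b1")]

nn.datanode_lost("dn1")            # step 1: dn1 shut down for maintenance
history.append(nn.status("b1"))

nn.block_report("dn2", [])         # step 2: the disk holding b1 on dn2 dies
history.append(nn.status("b1"))

nn.block_report("dn1", ["b1"])     # step 3: dn1 comes back with its replica
history.append(nn.status("b1"))

print(history)
```

In this model the block is "missing" after step 2 but is still known, so
the returning replica in step 3 makes it readable again, and it is then a
candidate for re-replication. If dn1 never came back with its copy, the
block would stay missing, which is the case described in 4).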


On Sun, Oct 27, 2013 at 7:42 PM, Koert Kuipers <koert@tresata.com> wrote:

> i have a cluster with replication factor 2. with the following events in
> this order, do i have data loss?
> 1) shut down a datanode for maintenance unrelated to hdfs. so now some
> blocks only have replication factor 1
> 2) a disk dies in another datanode. let's assume some blocks now have
> replication factor 0 since they were on this disk that died and on the
> datanode that is shut down for maintenance.
> 3) bring back up the datanode that was down for maintenance.
> what i am worried about is that the namenode gives up on a block with
> replication factor 0 after steps 1) and 2) and considers it lost, and by
> the time the replica will come back on again in step 3) the namenode no
> longer considers the block to be existent.
> thanks! koert

Bertrand Dechoux
