hadoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Harsh J <ha...@cloudera.com>
Subject Re: what happens when a datanode rejoins?
Date Tue, 11 Sep 2012 08:03:20 GMT
George has answered most of these. I'll just add on:

On Tue, Sep 11, 2012 at 12:44 PM, Mehul Choube
<Mehul_Choube@symantec.com> wrote:
> 1.       Some of the blocks it was managing are deleted/modified?

A DN runs a block report upon start, and sends the list of blocks to
the NN. NN validates them and if it finds any files to miss block
replicas post-report, it will schedule a re-replication from one of
the good DNs that still carry it. The modified (out-of-HDFS) blocks
fail their stored checksums so are treated as corrupt and deleted, and
are re-replicated in the same manner.

> 2.       The size of the blocks are now modified say from 64MB to 128MB?

George's got this already. Changing of block size does not impact any
existing blocks. It is a per-file metadata prop.

> 3.       What if the block replication factor was one (yea not in most
> deployments but say incase) so does the namenode recreate a file once the
> datanode rejoins?

Files exist at the NN metadata (its fsimage/edits persist this).
Blocks pertaining to a file exists at a DN. If the file had a single
replica and that replica was lost, then the file's data is lost and
the NameNode will tell you as much in its metrics/fsck.

Harsh J

View raw message