hadoop-user mailing list archives

From Mehul Choube <Mehul_Cho...@symantec.com>
Subject RE: what happens when a datanode rejoins?
Date Tue, 11 Sep 2012 09:06:08 GMT
> The namenode will asynchronously replicate the blocks to other datanodes in order to maintain
> the replication factor after a datanode has not been in contact for 10 minutes.
What happens when the datanode rejoins after the namenode has already re-replicated the blocks
it was managing?
Will the namenode ask the datanode to discard those blocks and start managing new blocks?
Or will the namenode discard the new replicas that were created while this datanode was
unavailable?



Thanks,
Mehul


From: George Datskos [mailto:george.datskos@jp.fujitsu.com]
Sent: Tuesday, September 11, 2012 12:56 PM
To: user@hadoop.apache.org
Subject: Re: what happens when a datanode rejoins?

Hi Mehul
> Some of the blocks it was managing are deleted/modified?

The namenode will asynchronously replicate the blocks to other datanodes in order to maintain
the replication factor after a datanode has not been in contact for 10 minutes.
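
For reference, the "10 minutes" above is approximate. A minimal sketch of where the number comes from, assuming the stock defaults of a 5-minute heartbeat recheck interval and a 3-second datanode heartbeat interval (the function and parameter names below are illustrative, not Hadoop identifiers; check the values in your own hdfs-site.xml):

```python
# Sketch of the namenode's dead-datanode timeout.
# Assumption: defaults of 300000 ms for the heartbeat recheck interval and
# 3 s for the datanode heartbeat interval. Names here are hypothetical.

def heartbeat_expire_interval_ms(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Time without heartbeats after which a datanode is treated as dead."""
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s

timeout_ms = heartbeat_expire_interval_ms()
print(timeout_ms)           # milliseconds
print(timeout_ms / 60_000)  # minutes
```

With those assumed defaults the timeout works out to 10.5 minutes, and re-replication is only scheduled after that point.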


> The block size is now changed, say from 64MB to 128MB?

Block size is a per-file setting, so new files will use 128MB blocks, but existing files
will keep their 64MB blocks.
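
As a toy illustration of the per-file point (this is a simplified model, not Hadoop code; `FileMeta` and `create_file` are hypothetical names), the block size is recorded when a file is written and is not affected by a later change to the configured default:

```python
from dataclasses import dataclass

MB = 1024 * 1024

@dataclass(frozen=True)
class FileMeta:
    """Hypothetical stand-in for the per-file metadata the namenode keeps."""
    length: int      # file length in bytes
    block_size: int  # block size fixed when the file was written

    def num_blocks(self) -> int:
        # Ceiling division without floats.
        return -(-self.length // self.block_size)

def create_file(length: int, default_block_size: int) -> FileMeta:
    # The default in effect at write time is stored with the file.
    return FileMeta(length=length, block_size=default_block_size)

# Written while the default was 64MB.
old_file = create_file(200 * MB, default_block_size=64 * MB)
# Written after the default was raised to 128MB.
new_file = create_file(200 * MB, default_block_size=128 * MB)

print(old_file.block_size // MB, old_file.num_blocks())
print(new_file.block_size // MB, new_file.num_blocks())
```

The old file still reports 64MB blocks; it would have to be rewritten (copied) to pick up the new size.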


> What if the block replication factor was one (not typical in most deployments, but just
> in case)? Does the namenode recreate a file once the datanode rejoins?

(Assuming you didn't perform a decommission.) Blocks that lived only on that datanode will
be declared "missing", and the files associated with those blocks cannot be fully read
until the datanode rejoins.
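
A rough sketch of the replication-factor-one case, using a made-up in-memory block map (the block IDs, datanode names, and `missing_blocks` helper are all hypothetical; the real namenode bookkeeping is more involved):

```python
def missing_blocks(block_locations, live_nodes):
    """Return the blocks that have no replica on any live datanode."""
    return sorted(
        block for block, nodes in block_locations.items()
        if not (nodes & live_nodes)
    )

# blk_1 has replication factor 1 and lives only on dn1; blk_2 has two replicas.
block_locations = {
    "blk_1": {"dn1"},
    "blk_2": {"dn1", "dn2"},
}

live_nodes = {"dn2"}  # dn1 has stopped sending heartbeats
print(missing_blocks(block_locations, live_nodes))

live_nodes.add("dn1")  # dn1 rejoins and reports its blocks again
print(missing_blocks(block_locations, live_nodes))
```

While dn1 is away only blk_1 is missing; once dn1 is back the list is empty and the file is readable again without anything being recreated.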



George
