hadoop-user mailing list archives

From Vijay Thakorlal <vijayj...@hotmail.com>
Subject RE: NameNode failure and recovery!
Date Wed, 03 Apr 2013 14:56:40 GMT
Hi Rahul,


The SNN does not act as a backup / standby NameNode in the event of failure. 


The sole purpose of the Secondary NameNode (its job is better described by the name of the
newer Checkpoint Node, which performs the same function) is to checkpoint the current state of HDFS:


1) The SNN asks the NN to roll its edits file, so new edits go to a fresh file

2) The SNN retrieves the fsimage and edits files from the NN

3) The SNN loads the fsimage into memory

4) The SNN replays the edits log on top of it to merge the two

5) The SNN transfers the merged checkpoint back to the NN

6) The NN uses the checkpoint as its new fsimage file
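
To make the checkpoint cycle concrete, here is a rough toy model in Python. It is only a sketch and not Hadoop code: the fsimage is modelled as a dict, the edits log as a list of (op, path) tuples, and the names ToyNameNode, replay and checkpoint are made up for illustration. It assumes the edits file is rolled before the SNN fetches anything, so that changes made during the checkpoint land in a fresh segment.

```python
# Toy model of an HDFS checkpoint. NOT Hadoop code: all names are hypothetical.
# fsimage: dict mapping path -> metadata; edits: list of (op, path) tuples.

def replay(fsimage, edits):
    """Return a new namespace: the fsimage with every edit applied in order."""
    merged = dict(fsimage)  # copy so the input image is left untouched
    for op, path in edits:
        if op == "create":
            merged[path] = {"type": "file"}
        elif op == "mkdir":
            merged[path] = {"type": "dir"}
        elif op == "delete":
            merged.pop(path, None)
        else:
            raise ValueError(f"unknown op: {op}")
    return merged


class ToyNameNode:
    def __init__(self, fsimage):
        self.fsimage = dict(fsimage)
        self.edits = []        # edits logged since the last checkpoint
        self.edits_new = None  # fresh segment, open only while checkpointing

    def log(self, op, path):
        # While a checkpoint is running, new changes go to the rolled segment.
        target = self.edits if self.edits_new is None else self.edits_new
        target.append((op, path))

    def roll_edits(self):
        self.edits_new = []

    def install_checkpoint(self, new_image):
        # The merged image replaces fsimage; the rolled segment becomes edits.
        self.fsimage = dict(new_image)
        self.edits = self.edits_new if self.edits_new is not None else []
        self.edits_new = None


def checkpoint(nn):
    """What the SNN does: fetch, merge, hand the result back to the NN."""
    nn.roll_edits()
    image, edits = dict(nn.fsimage), list(nn.edits)  # "transfer" to the SNN
    merged = replay(image, edits)                    # load image, replay edits
    nn.install_checkpoint(merged)                    # "transfer" back
    return merged


nn = ToyNameNode({"/": {"type": "dir"}})
nn.log("mkdir", "/data")
nn.log("create", "/data/a.txt")
nn.log("delete", "/data/a.txt")
merged = checkpoint(nn)
```

After the call, the NN's fsimage already reflects the three logged operations and its edits list is empty again, which is the whole point of checkpointing: the edits log stays short and a restart does not have to replay a long history.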


It’s true that technically you could use the fsimage from the SNN if you completely lost the
NN, and yes, as you said, you would “lose” any changes to HDFS that occurred between the last
checkpoint and the NN dying. But as mentioned, the SNN is not a backup for the NN.
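
As a rough illustration of what would be lost, the sketch below picks out the edits the NN logged after the last checkpoint. It is a toy example and not Hadoop code: the txid field, the lost_edits helper and the sample operations are all made up for illustration.

```python
# Toy illustration, NOT Hadoop code: which edits are lost when recovering
# from the SNN's last checkpoint. The txid numbering is hypothetical.

def lost_edits(edit_log, last_checkpoint_txid):
    """Edits the NN logged after the checkpoint; the SNN never saw them."""
    return [e for e in edit_log if e["txid"] > last_checkpoint_txid]


edit_log = [
    {"txid": 1, "op": "mkdir /data"},
    {"txid": 2, "op": "create /data/a.txt"},
    {"txid": 3, "op": "create /data/b.txt"},  # logged after the checkpoint
    {"txid": 4, "op": "delete /data/a.txt"},  # logged after the checkpoint
]

# Suppose the last checkpoint covered everything up to txid 2.
lost = lost_edits(edit_log, last_checkpoint_txid=2)
```

In this example the recovered namespace would still contain /data/a.txt and would be missing /data/b.txt, which is why the SNN's copy cannot be treated as a current backup.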





From: Rahul Bhattacharjee [mailto:rahul.rec.dgp@gmail.com] 
Sent: 03 April 2013 15:40
To: user@hadoop.apache.org
Subject: NameNode failure and recovery!


Hi all,

I was reading about Hadoop and learned that there are two ways to protect against
name node failures.

1) Write to an NFS mount along with the usual local disk.

2) Use a secondary name node. In case of failure of the NN, the SNN can take charge.

My questions:

1) The SNN is always lagging, so when the SNN becomes primary in the event of an NN failure, the
edits which have not been merged into the image file would be lost, so the state of the SNN
would not be consistent with the NN before its failure.

2) I have also read that the other purpose of the SNN is to periodically merge the edit logs with
the image file. If a setup goes with option #1 (writing to NFS, no SNN), then who does
this merging?



