hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "NameNodeFailover" by TedDunning
Date Fri, 20 Jul 2007 17:06:23 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by TedDunning:
http://wiki.apache.org/lucene-hadoop/NameNodeFailover

New page:
The name node is a critical resource for the cluster because data nodes don't know enough
about the blocks that they contain to coherently answer requests for anything but the block
contents.  This generally isn't a serious problem because a single machine is typically fairly
reliable (daily or hourly failures are expected only across a large cluster).

That said, there is a secondary name node that contacts the primary name node at regular
intervals in order to keep track of the files in the system.  It does this by copying the
fsimage and editlog files from the primary name node.
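
As a rough illustration of the copying step described above, here is a minimal Python sketch.
It is not how Hadoop itself implements the secondary name node; the function name
copy_checkpoint and the directory arguments are hypothetical, and the file names fsimage and
editlog are simply taken from the text.

```python
import shutil
from pathlib import Path

# File names as mentioned in the text; the real on-disk layout may differ.
CHECKPOINT_FILES = ("fsimage", "editlog")


def copy_checkpoint(primary_dir: Path, secondary_dir: Path) -> list:
    """Copy the name node metadata files from primary_dir to secondary_dir.

    Hypothetical helper for illustration only. Returns the names of the
    files that were copied; files missing on the primary are skipped.
    """
    secondary_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in CHECKPOINT_FILES:
        source = primary_dir / name
        if not source.is_file():
            continue
        # Copy under a temporary name first, then rename, so that a crash
        # mid-copy never leaves a half-written file under the final name.
        partial = secondary_dir / (name + ".tmp")
        shutil.copy2(source, partial)
        partial.replace(secondary_dir / name)
        copied.append(name)
    return copied
```

Running such a function on a timer would give the secondary a copy of the metadata that is
at most one interval old, which is the property the failover procedure relies on.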

If the name node dies, the best procedure is to simply use DNS to swap the names of the primary
and secondary name nodes.  The secondary name node will serve as primary name node as long as
nodes request metadata from it.  Once you get your old primary back up, you should reconfigure
it to be the secondary name node and you will be back in full operation.
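
The DNS swap above can be pictured with a toy model in which DNS is just a mapping from role
name to machine.  This is only a sketch under that assumption: the function fail_over and the
names "namenode", "secondary", "host-a" and "host-b" are made up for illustration, and a real
deployment would change records on an actual DNS server instead.

```python
def fail_over(dns: dict, primary_name: str, secondary_name: str) -> dict:
    """Return new records with the two role names swapped.

    Hypothetical helper: `dns` maps a role name to the machine that
    currently answers for it. The input mapping is left unmodified.
    """
    updated = dict(dns)
    # The old secondary machine now answers under the primary's name ...
    updated[primary_name] = dns[secondary_name]
    # ... and the old primary, once repaired, comes back as the secondary.
    updated[secondary_name] = dns[primary_name]
    return updated


records = {"namenode": "host-a", "secondary": "host-b"}
after = fail_over(records, "namenode", "secondary")
```

Because data nodes and clients look the name node up by name, they need no reconfiguration in
this model; only the mapping changes.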

Questions I still have include:

* what do you have to do to the old primary to make it a secondary?

* can you have more than one secondary name node (for off-site backup purposes)?

* are there plans for distributing the name node function?  
