hadoop-common-user mailing list archives

From "Jean-Daniel Cryans" <jdcry...@apache.org>
Subject Re: SecondaryNameNode on separate machine
Date Wed, 29 Oct 2008 00:14:44 GMT

Contrary to popular belief, the secondary namenode does not provide failover;
it's only used to do what is described here:

So the term "secondary" does not mean "a second one" but is more like "a
second part of".


On Tue, Oct 28, 2008 at 9:44 AM, Tomislav Poljak <tpoljak@gmail.com> wrote:

> Hi,
> I'm trying to implement NameNode failover (or at least a backup of the
> NameNode's local data), but it is hard since there is no official
> documentation. Pages on this subject have been created but are still empty:
> http://wiki.apache.org/hadoop/NameNodeFailover
> http://wiki.apache.org/hadoop/SecondaryNameNode
> I have been browsing the web and the hadoop mailing list to see how this
> should be implemented, but I got even more confused. People are asking
> whether we even need the SecondaryNameNode at all (since the NameNode can
> write its local data to multiple locations, one of which can be a disk
> mounted from another machine). I think I understand the motivation for the
> SecondaryNameNode (to create a snapshot of the NameNode data every n
> seconds/hours), but setting up (deploying and running) the SecondaryNameNode
> on a different machine than the NameNode is not as trivial as I expected.
> First I found that to run the SecondaryNameNode on a machine other than the
> NameNode, I should change the masters file on the NameNode (change localhost
> to the SecondaryNameNode host) and set some properties in hadoop-site.xml on
> the SecondaryNameNode (fs.default.name, fs.checkpoint.dir,
> fs.checkpoint.period etc.)
> This was enough to start the SecondaryNameNode when starting the NameNode
> with bin/start-dfs.sh, but it didn't create an image on the
> SecondaryNameNode. Then I found that I need to set dfs.http.address to the
> NameNode address (so now I have the NameNode address in both
> fs.default.name and dfs.http.address).
> Now I get the following exception:
> 2008-10-28 09:18:00,098 ERROR NameNode.Secondary - Exception in
> doCheckpoint:
> 2008-10-28 09:18:00,098 ERROR NameNode.Secondary -
> java.net.SocketException: Unexpected end of file from server
> My questions are the following:
> How do I resolve this problem (this exception)?
> Do I need an additional property in the SecondaryNameNode's hadoop-site.xml
> or the NameNode's hadoop-site.xml?
> How should NameNode failover ideally work? Is it like this:
> The SecondaryNameNode runs on a separate machine from the NameNode and
> stores the NameNode's data (fsimage and edits) locally in fs.checkpoint.dir.
> When the NameNode machine crashes, we start a NameNode on the machine where
> the SecondaryNameNode was running and set dfs.name.dir to
> fs.checkpoint.dir. We also need to change how DNS resolves the NameNode
> hostname (change from the primary to the secondary).
> Is this correct?
> Tomislav
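
For reference, the secondary-side settings named in the quoted message could be sketched roughly as below in the SecondaryNameNode's hadoop-site.xml. The property names come from the message itself; every value (the hostname namenode.example.com, the ports, the checkpoint directory and the one-hour period) is a placeholder assumption, not a tested setup, and this sketch is not claimed to resolve the SocketException reported above.

```xml
<?xml version="1.0"?>
<!-- Sketch of hadoop-site.xml on the SecondaryNameNode host.
     All values are illustrative assumptions; adjust to your cluster. -->
<configuration>
  <!-- RPC address of the primary NameNode (hypothetical host and port). -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
  <!-- HTTP address of the primary NameNode, used to fetch the image
       and edits (hypothetical host; port assumed to be 50070). -->
  <property>
    <name>dfs.http.address</name>
    <value>namenode.example.com:50070</value>
  </property>
  <!-- Local directory where the checkpoint is stored (hypothetical path). -->
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/data/hadoop/dfs/namesecondary</value>
  </property>
  <!-- Seconds between two checkpoints (assumed value: one hour). -->
  <property>
    <name>fs.checkpoint.period</name>
    <value>3600</value>
  </property>
</configuration>
```

As the message describes, the masters file on the NameNode would then list the SecondaryNameNode's hostname instead of localhost.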