hadoop-mapreduce-user mailing list archives

From ESGLinux <esggru...@gmail.com>
Subject Re: question about ZKFC daemon
Date Tue, 15 Jan 2013 09:53:21 GMT
Hi all,

I'm only testing the new HA feature; I'm not on a production system.

Well, let's talk about the number of nodes and the ZKFC daemons.

At this URL:
https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover

you can read:
If you have configured automatic failover using the ZooKeeper
FailoverController (ZKFC), you must install and start the zkfc daemon on
each of the machines that runs a NameNode.

So the number of ZKFC daemons is two, but at this URL:

http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper

you can read this:
In a typical deployment, ZooKeeper daemons are configured to run on three
or five nodes

I think that to ensure a good HA environment (of any kind) you need an odd
number of nodes to avoid split-brain. The problem I see here is that if
ZKFC monitors the NameNodes, a CDH4 environment only has 2 NNs
(active + standby).
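
To make the odd-number rule concrete, here is a small sketch of the majority
arithmetic. It is only an illustration; the function names are my own and are
not part of Hadoop or ZooKeeper.

```python
def majority(n: int) -> int:
    """Smallest number of members that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Members that can be lost while a strict majority is still reachable."""
    return n - majority(n)


# Compare a few group sizes side by side.
for n in (2, 3, 4, 5):
    print(f"members={n} majority={majority(n)} tolerated={tolerated_failures(n)}")
```

With 2 members, losing either one loses the majority, and 4 members tolerate
the same single failure as 3, which is why odd sizes are the usual choice.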

So I'm a bit confused by this deployment...

Any suggestions?

Thanks in advance for all your answers

Kind regards,

ESGLinux




2013/1/14 Colin McCabe <cmccabe@alumni.cmu.edu>

> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cmccabe@alumni.cmu.edu>
> wrote:
> > Hi ESGLinux,
> >
> > In production, you need to run QJM on at least 3 nodes.  You also need
> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
> > if you like, though.
>
> Er, this should read "You also need to run ZooKeeper on at least 3
> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
> active NN node and the standby NN node.
>
> Colin
>
> >
> > Of course, none of this is "needed" to set up an example cluster.  If
> > you just want to try something out, you can run everything on the same
> > node if you want.  It depends on what you're trying to do.
> >
> > cheers,
> > Colin
> >
> >
> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <esggrupos@gmail.com> wrote:
> >> Thank you for your answer Craig,
> >>
> >> I'm planning my cluster and for now I'm not sure how many machines I
> >> need ;-)
> >>
> >> If I have doubts I'll do what Cloudera says, and if I have a problem I
> >> have somewhere to ask for explanations :-)
> >>
> >> ESGLinux
> >>
> >>
> >>
> >> 2012/12/28 Craig Munro <craig.munro@gmail.com>
> >>>
> >>> OK, I have reliable storage on my datanodes so not an issue for me.  If
> >>> that's what Cloudera recommends then I'm sure it's fine.
> >>>
> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <esggrupos@gmail.com> wrote:
> >>>>
> >>>> Hi Craig,
> >>>>
> >>>> I'm a bit confused. I have read this from Cloudera:
> >>>>
> >>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
> >>>>
> >>>> The JournalNode daemon is relatively lightweight, so these daemons can
> >>>> reasonably be collocated on machines with other Hadoop daemons, for example
> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.) so the
> >>>> JournalNodes' local directories can use the reliable local storage on those
> >>>> machines.
> >>>> There must be at least three JournalNode daemons, since edit log
> >>>> modifications must be written to a majority of JournalNodes.
> >>>>
> >>>> As you can read, they recommend putting the JournalNode daemons with the
> >>>> NameNodes, but you say the opposite??
> >>>>
> >>>>
> >>>> Thanks for your answer,
> >>>>
> >>>> ESGLinux,
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> 2012/12/28 Craig Munro <craig.munro@gmail.com>
> >>>>>
> >>>>> You need the following:
> >>>>>
> >>>>> - active namenode + zkfc
> >>>>> - standby namenode + zkfc
> >>>>> - pool of journal nodes (odd number, 3 or more)
> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
> >>>>>
> >>>>> As the journal nodes hold the namesystem transactions they should not be
> >>>>> co-located with the namenodes in case of failure.  I distribute the journal
> >>>>> and zookeeper nodes across the hosts running datanodes or as Harsh says you
> >>>>> could co-locate them on dedicated hosts.
> >>>>>
> >>>>> ZKFC does not monitor the JobTracker.
> >>>>>
> >>>>> Regards,
> >>>>> Craig
> >>>>>
> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <esggrupos@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Well, if I have understood you, I can configure my NN HA cluster this
> >>>>>> way:
> >>>>>>
> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
> >>>>>>
> >>>>>> Is this right?
> >>>>>>
> >>>>>> Thanks in advance,
> >>>>>>
> >>>>>> ESGLinux,
> >>>>>>
> >>>>>> 2012/12/27 Harsh J <harsh@cloudera.com>
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> There are two different things here: Automatic Failover and Quorum
> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover Controller,
> >>>>>>> is to manage failovers automatically (based on health checks of NNs).
> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
> >>>>>>> storage for namesystem transactions that helps enable HA.
> >>>>>>>
> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
> >>>>>>> reliable HA, preferably on nodes of their own if possible (like you
> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
> >>>>>>> quorum).
> >>>>>>>
> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <esggrupos@gmail.com> wrote:
> >>>>>>> > Hi all,
> >>>>>>> >
> >>>>>>> > I have a question about how to deploy ZooKeeper in a NN HA cluster.
> >>>>>>> >
> >>>>>>> > As far as I know, I need at least three nodes to run three ZooKeeper
> >>>>>>> > FailOver Controllers (ZKFC). I plan to place these 3 daemons this way:
> >>>>>>> >
> >>>>>>> > - Active NameNode + 1 ZKFC daemon
> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
> >>>>>>> > - JobTracker node + 1 ZKFC daemon (is this right?)
> >>>>>>> >
> >>>>>>> > so the quorum is formed by these three nodes. The nodes that run a
> >>>>>>> > NameNode make sense because the ZKFC monitors it, but what does the
> >>>>>>> > third daemon do?
> >>>>>>> >
> >>>>>>> > As I read at this URL:
> >>>>>>> >
> >>>>>>> > https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
> >>>>>>> >
> >>>>>>> > these daemons are only related to NameNodes ("Health monitoring - the
> >>>>>>> > ZKFC pings its local NameNode on a periodic basis with a health-check
> >>>>>>> > command."), so what does the third ZKFC do? I used the JobTracker node,
> >>>>>>> > but I could use another node without any daemon on it...
> >>>>>>> > Thanks in advance,
> >>>>>>> >
> >>>>>>> > ESGLInux,
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Harsh J
> >>>>>>
> >>>>>>
> >>>>
> >>
>
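
The layout the quoted thread converges on (one ZKFC on each NameNode host, plus
pools of three or more JournalNodes and ZooKeeper servers, odd in number) can be
sketched as a small check. This is only an illustration of those rules of
thumb: the class, the function, and every hostname below are hypothetical and
not part of Hadoop.

```python
from dataclasses import dataclass


@dataclass
class HaLayout:
    """Hosts assigned to each role (all hostnames are made up)."""
    namenode_hosts: set
    zkfc_hosts: set
    journalnode_hosts: set
    zookeeper_hosts: set


def check_layout(layout: HaLayout) -> list:
    """Return a list of problems; an empty list means the rules are met."""
    problems = []
    if len(layout.namenode_hosts) != 2:
        problems.append("expected exactly 2 NameNodes (active + standby)")
    # ZKFC monitors its local NameNode, so it belongs on those hosts only.
    if layout.zkfc_hosts != layout.namenode_hosts:
        problems.append("ZKFC should run on exactly the NameNode hosts")
    pools = (
        ("JournalNode", layout.journalnode_hosts),
        ("ZooKeeper", layout.zookeeper_hosts),
    )
    for name, hosts in pools:
        if len(hosts) < 3 or len(hosts) % 2 == 0:
            problems.append(f"{name} pool should be an odd number, 3 or more")
    return problems


# The first plan from the thread: a third ZKFC on the JobTracker node.
first_plan = HaLayout(
    namenode_hosts={"nn1", "nn2"},
    zkfc_hosts={"nn1", "nn2", "jt1"},
    journalnode_hosts={"nn1", "nn2", "jt1"},
    zookeeper_hosts={"zk1", "zk2", "zk3"},
)
print(check_layout(first_plan))
```

Dropping "jt1" from `zkfc_hosts` leaves this example with no reported problems.
Whether JournalNodes share hosts with the NameNodes is a separate choice on
which the quoted messages disagree, so the sketch does not check it.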
