hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: question about ZKFC daemon
Date Tue, 15 Jan 2013 10:11:34 GMT
No, ZooKeeper daemons == http://zookeeper.apache.org.
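
To make the distinction concrete, here is a minimal Python sketch. It is an illustration only, not taken from the Hadoop or ZooKeeper documentation. It shows why quorum peers (ZooKeeper servers, JournalNodes) are run in odd numbers, and checks a layout against the rule discussed in this thread: one ZKFC per NameNode, plus an odd number (3 or more) of ZK and JN peers. The host names and the `layout` dictionary are hypothetical.

```python
def majority(n):
    """Smallest number of peers that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many peers may fail while a majority is still reachable."""
    return n - majority(n)


# Daemons that form a quorum among themselves (assumption for this sketch).
QUORUM_DAEMONS = ("zookeeper", "journalnode")


def check_layout(layout):
    """Return a list of problems found in a host -> set-of-daemons mapping."""
    problems = []
    for host, daemons in sorted(layout.items()):
        # ZKFC is a ZooKeeper client that sits next to each NameNode,
        # and nowhere else.
        if ("zkfc" in daemons) != ("namenode" in daemons):
            problems.append(f"{host}: ZKFC must run on exactly the NameNode hosts")
    for daemon in QUORUM_DAEMONS:
        n = sum(daemon in daemons for daemons in layout.values())
        if n < 3 or n % 2 == 0:
            problems.append(f"{daemon}: {n} peers, want an odd number >= 3")
    return problems


# Hypothetical three-host layout: ZK and JN peers are collocated with the masters.
layout = {
    "nn1": {"namenode", "zkfc", "zookeeper", "journalnode"},
    "nn2": {"namenode", "zkfc", "zookeeper", "journalnode"},
    "jt1": {"jobtracker", "zookeeper", "journalnode"},
}

for n in range(1, 6):
    print(n, "peers: majority", majority(n), "tolerates", tolerated_failures(n))
print(check_layout(layout))
```

Going from 3 to 4 peers does not raise the number of tolerated failures, which is why odd ensemble sizes are the usual choice. The two ZKFCs are not part of any quorum, so their even count is not a problem.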


On Tue, Jan 15, 2013 at 3:38 PM, ESGLinux <esggrupos@gmail.com> wrote:

> Hi Harsh,
>
> Now I'm completely confused :-))))
>
> As you pointed out, ZKFC runs only on the NNs. That looks right.
>
> So, what are the ZK peers (the odd number I'm looking for), and where do I
> have to run them? On another 3 nodes?
>
> As I read at the previous URL:
>
> In a typical deployment, ZooKeeper daemons are configured to run on three
> or five nodes. Since ZooKeeper itself has light resource requirements, it
> is acceptable to collocate the ZooKeeper nodes on the same hardware as the
> HDFS NameNode and Standby Node. Many operators choose to deploy the third
> ZooKeeper process on the same node as the YARN ResourceManager. It is
> advisable to configure the ZooKeeper nodes to store their data on separate
> disk drives from the HDFS metadata for best performance and isolation.
>
> Here, ZooKeeper daemons = ZKFC?
>
>
> Thanks
>
> ESGLinux,
>
>
>
> 2013/1/15 Harsh J <harsh@cloudera.com>
>
>> Hi,
>>
>> I fail to see your confusion.
>>
>> ZKFC != ZK
>>
>> ZK is quorum-based software, like QJM. ZK peers should be run in odd
>> numbers, just as JNs should.
>>
>> ZKFC is what the NN needs for its Automatic Failover capability. It
>> is a client of ZK and therefore requires a running ZK ensemble; the odd
>> number of nodes is suggested for that ensemble. ZKFC itself runs only one per NN.
>>
>>
>> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <esggrupos@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I'm only testing the new HA feature; I'm not on a production system.
>>>
>>> Well, let's talk about the number of nodes and the ZKFC daemons.
>>>
>>> In this url:
>>>
>>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>>
>>> you can read:
>>> If you have configured automatic failover using the ZooKeeper
>>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>>> each of the machines that runs a NameNode.
>>>
>>> So the number of ZKFC daemons is two, but at this URL:
>>>
>>>
>>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>>
>>> you can read this:
>>> In a typical deployment, ZooKeeper daemons are configured to run on
>>> three or five nodes
>>>
>>> I think that to ensure a good HA environment (of any kind) you need an
>>> odd number of nodes to avoid split-brain. The problem I see here is that if
>>> ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
>>> (active+standby).
>>>
>>> So I'm a bit confused by this deployment...
>>>
>>> Any suggestion?
>>>
>>> Thanks in advance for all your answers
>>>
>>> Kind regards,
>>>
>>> ESGLinux
>>>
>>>
>>>
>>>
>>> 2013/1/14 Colin McCabe <cmccabe@alumni.cmu.edu>
>>>
>>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cmccabe@alumni.cmu.edu>
>>>> wrote:
>>>> > Hi ESGLinux,
>>>> >
>>>> > In production, you need to run QJM on at least 3 nodes.  You also need
>>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>>> > if you like, though.
>>>>
>>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>>> active NN node and the standby NN node.
>>>>
>>>> Colin
>>>>
>>>> >
>>>> > Of course, none of this is "needed" to set up an example cluster.  If
>>>> > you just want to try something out, you can run everything on the same
>>>> > node if you want.  It depends on what you're trying to do.
>>>> >
>>>> > cheers,
>>>> > Colin
>>>> >
>>>> >
>>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <esggrupos@gmail.com> wrote:
>>>> >> Thank you for your answer Craig,
>>>> >>
>>>> >> I'm planning my cluster and for now I'm not sure how many machines I need ;-)
>>>> >>
>>>> >> If in doubt I'll follow what Cloudera says, and if I have a problem I have
>>>> >> somewhere to ask for explanations :-)
>>>> >>
>>>> >> ESGLinux
>>>> >>
>>>> >>
>>>> >>
>>>> >> 2012/12/28 Craig Munro <craig.munro@gmail.com>
>>>> >>>
>>>> >>> OK, I have reliable storage on my datanodes so not an issue for me.  If
>>>> >>> that's what Cloudera recommends then I'm sure it's fine.
>>>> >>>
>>>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <esggrupos@gmail.com> wrote:
>>>> >>>>
>>>> >>>> Hi Craig,
>>>> >>>>
>>>> >>>> I'm a bit confused. I have read this from Cloudera:
>>>> >>>>
>>>> >>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>>> >>>>
>>>> >>>> The JournalNode daemon is relatively lightweight, so these daemons can
>>>> >>>> reasonably be collocated on machines with other Hadoop daemons, for example
>>>> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.) so the
>>>> >>>> JournalNodes' local directories can use the reliable local storage on those
>>>> >>>> machines.
>>>> >>>> There must be at least three JournalNode daemons, since edit log
>>>> >>>> modifications must be written to a majority of JournalNodes
>>>> >>>>
>>>> >>>> As you can read, they recommend putting the JournalNode daemons with the
>>>> >>>> NameNodes, but you say the opposite??
>>>> >>>>
>>>> >>>>
>>>> >>>> Thanks for your answer,
>>>> >>>>
>>>> >>>> ESGLinux,
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>> 2012/12/28 Craig Munro <craig.munro@gmail.com>
>>>> >>>>>
>>>> >>>>> You need the following:
>>>> >>>>>
>>>> >>>>> - active namenode + zkfc
>>>> >>>>> - standby namenode + zkfc
>>>> >>>>> - pool of journal nodes (odd number, 3 or more)
>>>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>>> >>>>>
>>>> >>>>> As the journal nodes hold the namesystem transactions they should not be
>>>> >>>>> co-located with the namenodes in case of failure.  I distribute the journal
>>>> >>>>> and zookeeper nodes across the hosts running datanodes or as Harsh says you
>>>> >>>>> could co-locate them on dedicated hosts.
>>>> >>>>>
>>>> >>>>> ZKFC does not monitor the JobTracker.
>>>> >>>>>
>>>> >>>>> Regards,
>>>> >>>>> Craig
>>>> >>>>>
>>>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <esggrupos@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> Hi,
>>>> >>>>>>
>>>> >>>>>> Well, if I have understood you, I can configure my NN HA cluster this
>>>> >>>>>> way:
>>>> >>>>>>
>>>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>>> >>>>>>
>>>> >>>>>> Is this right?
>>>> >>>>>>
>>>> >>>>>> Thanks in advance,
>>>> >>>>>>
>>>> >>>>>> ESGLinux,
>>>> >>>>>>
>>>> >>>>>> 2012/12/27 Harsh J <harsh@cloudera.com>
>>>> >>>>>>>
>>>> >>>>>>> Hi,
>>>> >>>>>>>
>>>> >>>>>>> There are two different things here: Automatic Failover and Quorum
>>>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover Controller,
>>>> >>>>>>> is to manage failovers automatically (based on health checks of NNs).
>>>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
>>>> >>>>>>> storage for namesystem transactions that helps enable HA.
>>>> >>>>>>>
>>>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
>>>> >>>>>>> reliable HA, preferably on nodes of their own if possible (like you
>>>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>>>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>>> >>>>>>> quorum).
>>>> >>>>>>>
>>>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <esggrupos@gmail.com> wrote:
>>>> >>>>>>> > Hi all,
>>>> >>>>>>> >
>>>> >>>>>>> > I have a question about how to deploy ZooKeeper in an NN HA
>>>> >>>>>>> > cluster.
>>>> >>>>>>> >
>>>> >>>>>>> > As far as I know, I need at least three nodes to run three ZooKeeper
>>>> >>>>>>> > FailOver Controllers (ZKFC). I plan to place these 3 daemons this way:
>>>> >>>>>>> >
>>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon (is this right?)
>>>> >>>>>>> >
>>>> >>>>>>> > so the quorum is formed by these three nodes. The nodes that run
>>>> >>>>>>> > a NameNode make sense because the ZKFC monitors it, but what does the
>>>> >>>>>>> > third daemon do?
>>>> >>>>>>> >
>>>> >>>>>>> > As I read at this URL:
>>>> >>>>>>> >
>>>> >>>>>>> > https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>> >>>>>>> >
>>>> >>>>>>> > these daemons are related only to NameNodes ("Health monitoring -
>>>> >>>>>>> > the ZKFC pings its local NameNode on a periodic basis with a health-check
>>>> >>>>>>> > command."), so what does the third ZKFC do? I used the JobTracker node, but I
>>>> >>>>>>> > could use another node without any daemon on it...
>>>> >>>>>>> >
>>>> >>>>>>> > Thanks in advance,
>>>> >>>>>>> >
>>>> >>>>>>> > ESGLInux,
>>>> >>>>>>> >
>>>> >>>>>>> >
>>>> >>>>>>> >
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> --
>>>> >>>>>>> Harsh J
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>
>>>> >>
>>>>
>>>
>>>
>>
>>
>> --
>> Harsh J
>>
>
>


-- 
Harsh J
