hadoop-common-user mailing list archives

From ESGLinux <esggru...@gmail.com>
Subject Re: question about ZKFC daemon
Date Tue, 15 Jan 2013 10:17:45 GMT
OK,

That's the origin of my confusion; I thought they were the same.
I'm going to read this doc to get a bit more clarity on ZooKeeper...

Thank you very much for your help,

ESGLinux,



2013/1/15 Harsh J <harsh@cloudera.com>

> No, ZooKeeper daemons == http://zookeeper.apache.org.
>
>
> On Tue, Jan 15, 2013 at 3:38 PM, ESGLinux <esggrupos@gmail.com> wrote:
>
>> Hi Harsh,
>>
>> Now I'm completely confused :-))))
>>
>> As you pointed out, ZKFC runs only on the NNs. That looks right.
>>
>> So, what are the ZK peers (the odd number I'm looking for), and where do I have
>> to run them? On another 3 nodes?
>>
>> As I read in the previous URL:
>>
>> In a typical deployment, ZooKeeper daemons are configured to run on three
>> or five nodes. Since ZooKeeper itself has light resource requirements, it
>> is acceptable to collocate the ZooKeeper nodes on the same hardware as the
>> HDFS NameNode and Standby Node. Many operators choose to deploy the third
>> ZooKeeper process on the same node as the YARN ResourceManager. It is
>> advisable to configure the ZooKeeper nodes to store their data on separate
>> disk drives from the HDFS metadata for best performance and isolation.
>>
>> Here,  ZooKeeper daemons = ZKFC?
>>
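For reference, a minimal zoo.cfg for the three-node ensemble described in that excerpt could look roughly like this (a sketch only; the hostnames and the dataDir path are placeholders, with dataDir kept on a different drive than the NameNode metadata as the excerpt advises):

    # zoo.cfg -- same file on all three ZooKeeper servers
    tickTime=2000
    initLimit=10
    syncLimit=5
    # keep this directory on a separate disk from the HDFS metadata
    dataDir=/data/zookeeper
    clientPort=2181
    # ensemble members; each host also needs a myid file (1, 2 or 3) under dataDir
    server.1=nn1.example.com:2888:3888
    server.2=nn2.example.com:2888:3888
    server.3=jt1.example.com:2888:3888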
>>
>> Thanks
>>
>> ESGLinux,
>>
>>
>>
>> 2013/1/15 Harsh J <harsh@cloudera.com>
>>
>>> Hi,
>>>
>>> I fail to see your confusion.
>>>
>>> ZKFC != ZK
>>>
>>> ZK is quorum software, like QJM is. The ZK peers are to be run in odd
>>> numbers, just as the JNs are.
>>>
>>> ZKFC is something the NN needs for its Automatic Failover capability. It
>>> is a client of ZK and therefore requires ZK's presence; that is what the odd
>>> number of nodes is suggested for. The ZKFC itself is run only once per NN.
>>>
>>>
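For concreteness, the wiring Harsh describes between the ZKFCs and the ZK quorum comes down to two settings (a minimal sketch; the zk* hostnames are placeholders for wherever the three ZooKeeper servers actually run):

    <!-- hdfs-site.xml: turn on automatic failover, handled by the ZKFC on each NN -->
    <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
    </property>

    <!-- core-site.xml: the ZooKeeper ensemble the ZKFCs connect to -->
    <property>
      <name>ha.zookeeper.quorum</name>
      <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
    </property>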
>>> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <esggrupos@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'm only testing the new HA feature; I'm not on a production system.
>>>>
>>>> Well, let's talk about the number of nodes and the ZKFC daemons.
>>>>
>>>> In this URL:
>>>>
>>>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>>>
>>>> you can read:
>>>> If you have configured automatic failover using the ZooKeeper
>>>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>>>> each of the machines that runs a NameNode.
>>>>
>>>> So the number of ZKFC daemons is two, but reading this URL:
>>>>
>>>>
>>>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>>>
>>>> you can read this:
>>>> In a typical deployment, ZooKeeper daemons are configured to run on
>>>> three or five nodes
>>>>
>>>> I think that to ensure a good HA environment (of any kind) you need an
>>>> odd number of nodes to avoid split-brain. The problem I see here is that if
>>>> ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
>>>> (active + standby).
>>>>
>>>> So I'm a bit confused by this deployment...
>>>>
>>>> Any suggestion?
>>>>
>>>> Thanks in advance for all your answers
>>>>
>>>> Kind regards,
>>>>
>>>> ESGLinux
>>>>
>>>>
>>>>
>>>>
>>>> 2013/1/14 Colin McCabe <cmccabe@alumni.cmu.edu>
>>>>
>>>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cmccabe@alumni.cmu.edu>
>>>>> wrote:
>>>>> > Hi ESGLinux,
>>>>> >
>>>>> > In production, you need to run QJM on at least 3 nodes.  You also need
>>>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>>>> > if you like, though.
>>>>>
>>>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>>>> active NN node and the standby NN node.
>>>>>
>>>>> Colin
>>>>>
>>>>> >
>>>>> > Of course, none of this is "needed" to set up an example cluster.  If
>>>>> > you just want to try something out, you can run everything on the same
>>>>> > node if you want.  It depends on what you're trying to do.
>>>>> >
>>>>> > cheers,
>>>>> > Colin
>>>>> >
>>>>> >
>>>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <esggrupos@gmail.com> wrote:
>>>>> >> Thank you for your answer Craig,
>>>>> >>
>>>>> >> I'm planning my cluster and for now I'm not sure how many machines
>>>>> >> I need ;-)
>>>>> >>
>>>>> >> If I have doubts I'll do what Cloudera says, and if I have a problem
>>>>> >> I'll have somewhere to ask for explanations :-)
>>>>> >>
>>>>> >> ESGLinux
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> 2012/12/28 Craig Munro <craig.munro@gmail.com>
>>>>> >>>
>>>>> >>> OK, I have reliable storage on my datanodes so not an issue for
>>>>> >>> me.  If that's what Cloudera recommends then I'm sure it's fine.
>>>>> >>>
>>>>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <esggrupos@gmail.com> wrote:
>>>>> >>>>
>>>>> >>>> Hi Craig,
>>>>> >>>>
>>>>> >>>> I'm a bit confused, I have read this from Cloudera:
>>>>> >>>>
>>>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>>>> >>>>
>>>>> >>>> The JournalNode daemon is relatively lightweight, so these daemons can
>>>>> >>>> reasonably be collocated on machines with other Hadoop daemons, for example
>>>>> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>>>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>>>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.) so the
>>>>> >>>> JournalNodes' local directories can use the reliable local storage on those
>>>>> >>>> machines.
>>>>> >>>> There must be at least three JournalNode daemons, since edit log
>>>>> >>>> modifications must be written to a majority of JournalNodes
>>>>> >>>>
>>>>> >>>> As you can read, they recommend putting the JournalNode daemons with the
>>>>> >>>> NameNodes, but you say the opposite??
>>>>> >>>>
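For reference, the JournalNode layout the excerpt recommends maps to settings along these lines (a sketch; the hostnames, the local path and the nameservice name "mycluster" are placeholders):

    <!-- hdfs-site.xml -->
    <!-- local directory where each JournalNode stores edits; keep it on reliable local storage -->
    <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/data/jn</value>
    </property>

    <!-- the NameNodes write edits to a majority of these three JournalNodes -->
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://nn1.example.com:8485;nn2.example.com:8485;jt1.example.com:8485/mycluster</value>
    </property>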
>>>>> >>>>
>>>>> >>>> Thanks for your answer,
>>>>> >>>>
>>>>> >>>> ESGLinux,
>>>>> >>>>
>>>>> >>>>
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> 2012/12/28 Craig Munro <craig.munro@gmail.com>
>>>>> >>>>>
>>>>> >>>>> You need the following:
>>>>> >>>>>
>>>>> >>>>> - active namenode + zkfc
>>>>> >>>>> - standby namenode + zkfc
>>>>> >>>>> - pool of journal nodes (odd number, 3 or more)
>>>>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>>>> >>>>>
>>>>> >>>>> As the journal nodes hold the namesystem transactions they should not be
>>>>> >>>>> co-located with the namenodes in case of failure.  I distribute the journal
>>>>> >>>>> and zookeeper nodes across the hosts running datanodes, or as Harsh says you
>>>>> >>>>> could co-locate them on dedicated hosts.
>>>>> >>>>>
>>>>> >>>>> ZKFC does not monitor the JobTracker.
>>>>> >>>>>
>>>>> >>>>> Regards,
>>>>> >>>>> Craig
>>>>> >>>>>
>>>>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <esggrupos@gmail.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>> Hi,
>>>>> >>>>>>
>>>>> >>>>>> Well, if I have understood you, I can configure my NN HA cluster this
>>>>> >>>>>> way:
>>>>> >>>>>>
>>>>> >>>>>> - Active NameNode + 1 ZKFC daemon + JournalNode
>>>>> >>>>>> - Standby NameNode + 1 ZKFC daemon + JournalNode
>>>>> >>>>>> - JobTracker node + 1 ZKFC daemon + JournalNode
>>>>> >>>>>>
>>>>> >>>>>> Is this right?
>>>>> >>>>>>
>>>>> >>>>>> Thanks in advance,
>>>>> >>>>>>
>>>>> >>>>>> ESGLinux,
>>>>> >>>>>>
>>>>> >>>>>> 2012/12/27 Harsh J <harsh@cloudera.com>
>>>>> >>>>>>>
>>>>> >>>>>>> Hi,
>>>>> >>>>>>>
>>>>> >>>>>>> There are two different things here: Automatic Failover and Quorum
>>>>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover Controller,
>>>>> >>>>>>> is to manage failovers automatically (based on health checks of NNs).
>>>>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
>>>>> >>>>>>> storage for namesystem transactions that helps enable HA.
>>>>> >>>>>>>
>>>>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
>>>>> >>>>>>> reliable HA, preferably on nodes of their own if possible (like you
>>>>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>>>>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>>>> >>>>>>> quorum).
>>>>> >>>>>>>
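Alongside the JournalNodes and the per-NameNode ZKFCs that Harsh lists, an HA pair like this also carries a client failover proxy provider and a fencing method in hdfs-site.xml; a rough sketch (the nameservice name "mycluster" and the SSH key path are placeholders):

    <!-- how clients locate the currently active NameNode of the nameservice -->
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- a fencing method must be configured for failovers; sshfence is one documented option -->
    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>sshfence</value>
    </property>
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/home/hdfs/.ssh/id_rsa</value>
    </property>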
>>>>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <esggrupos@gmail.com> wrote:
>>>>> >>>>>>> > Hi all,
>>>>> >>>>>>> >
>>>>> >>>>>>> > I have a question about how to deploy ZooKeeper in a NN HA
>>>>> >>>>>>> > cluster.
>>>>> >>>>>>> >
>>>>> >>>>>>> > As far as I know, I need at least three nodes to run three ZooKeeper
>>>>> >>>>>>> > FailOver Controllers (ZKFC). I plan to put these 3 daemons this way:
>>>>> >>>>>>> >
>>>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon (is this right?)
>>>>> >>>>>>> >
>>>>> >>>>>>> > so the quorum is formed with these three nodes. The nodes that run
>>>>> >>>>>>> > a NameNode are right because the ZKFC monitors it, but what does the
>>>>> >>>>>>> > third daemon do?
>>>>> >>>>>>> >
>>>>> >>>>>>> > As I read in this URL:
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>>> >>>>>>> >
>>>>> >>>>>>> > these daemons are only related to NameNodes ("Health monitoring -
>>>>> >>>>>>> > the ZKFC pings its local NameNode on a periodic basis with a
>>>>> >>>>>>> > health-check command."),
>>>>> >>>>>>> > so what does the third ZKFC do? I used the JobTracker node, but I could
>>>>> >>>>>>> > use another node without any daemon on it...
>>>>> >>>>>>> >
>>>>> >>>>>>> > Thanks in advance,
>>>>> >>>>>>> >
>>>>> >>>>>>> > ESGLinux,
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>>> --
>>>>> >>>>>>> Harsh J
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>
>>>>> >>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
>
> --
> Harsh J
>
