From: Craig Munro
To: user@hadoop.apache.org
Date: Fri, 28 Dec 2012 10:08:46 +0000
Subject: Re: question about ZKFC daemon

You need the following:

- active namenode + zkfc
- standby namenode + zkfc
- pool of journal nodes (odd number, 3 or more)
- pool of zookeeper nodes (odd number, 3 or more)

As the journal nodes hold the namesystem transactions, they should not be co-located with the namenodes in case of failure. I distribute the journal and zookeeper nodes across the hosts running datanodes or, as Harsh says, you could co-locate them on dedicated hosts.

ZKFC does not monitor the JobTracker.
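
The layout rules above can be sketched as a small check. This is only an illustration, not a Hadoop tool: the function `check_ha_layout`, the role names, and the host names (`nn1`, `jt`, `dn1`, ...) are all hypothetical, and the rules simply encode the list and placement advice in this message.

```python
# Hypothetical layout checker; role and host names are illustrative only.
def check_ha_layout(hosts):
    """Return a list of problems found in a {host: set_of_roles} mapping."""
    problems = []
    namenodes = {h for h, roles in hosts.items() if "namenode" in roles}
    zkfcs = {h for h, roles in hosts.items() if "zkfc" in roles}
    journals = {h for h, roles in hosts.items() if "journalnode" in roles}
    zookeepers = {h for h, roles in hosts.items() if "zookeeper" in roles}

    # One active and one standby NameNode, each with its own ZKFC.
    if len(namenodes) != 2:
        problems.append("expected exactly two NameNodes (active and standby)")
    for h in sorted(namenodes - zkfcs):
        problems.append(f"{h}: NameNode without a ZKFC")
    # A ZKFC only monitors its local NameNode, so it has no job elsewhere.
    for h in sorted(zkfcs - namenodes):
        problems.append(f"{h}: ZKFC without a local NameNode")
    # Both pools should be an odd number of nodes, 3 or more.
    for name, pool in (("JournalNode", journals), ("ZooKeeper", zookeepers)):
        if len(pool) < 3 or len(pool) % 2 == 0:
            problems.append(
                f"{name} pool should be an odd number, 3 or more (got {len(pool)})"
            )
    # Journal nodes should not share a host with a NameNode.
    for h in sorted(journals & namenodes):
        problems.append(f"{h}: JournalNode co-located with a NameNode")
    return problems


# The layout proposed in the quoted message below.
proposed = {
    "nn1": {"namenode", "zkfc", "journalnode"},
    "nn2": {"namenode", "zkfc", "journalnode"},
    "jt": {"jobtracker", "zkfc", "journalnode"},
}

# One layout that follows the list above, with the pools spread over datanode hosts.
recommended = {
    "nn1": {"namenode", "zkfc"},
    "nn2": {"namenode", "zkfc"},
    "dn1": {"datanode", "journalnode", "zookeeper"},
    "dn2": {"datanode", "journalnode", "zookeeper"},
    "dn3": {"datanode", "journalnode", "zookeeper"},
}

for label, layout in (("proposed", proposed), ("recommended", recommended)):
    print(label, check_ha_layout(layout))
```

Run against the proposed layout, the check flags the ZKFC on the JobTracker host, the missing ZooKeeper pool, and the journal nodes sharing hosts with the namenodes.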

Regards,
Craig

On Dec 28, 2012 9:25 AM, "ESGLinux" <esggrupos@gmail.com> wrote:
Hi,

Well, if I have understood you, I can configure my NN HA cluster this way:

- Active NameNode + 1 ZKFC daemon + Journal Node
- Standby NameNode + 1 ZKFC daemon + Journal Node
- JobTracker node + 1 ZKFC daemon + Journal Node,

Is this right?

Thanks in advance,

ESGLinux,

2012/12/27 Harsh J <harsh@cloudera.com>
Hi,

There are two different things here: Automatic Failover and Quorum
Journal Manager. The former, used via a ZooKeeper Failover Controller,
is to manage failovers automatically (based on health checks of NNs).
The latter, used via a set of Journal Nodes, is a medium of shared
storage for namesystem transactions that helps enable HA.

In a typical deployment, you want 3 or more (odd) JournalNodes for
reliable HA, preferably on nodes of their own if possible (like you
would for typical ZooKeepers, and you may co-locate with those as
well) and one ZKFC for each NameNode (connected to the same ZK
quorum).
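
The "3 or more (odd)" sizing can be illustrated with a little arithmetic. This is a minimal sketch, assuming both the JournalNode set and the ZooKeeper ensemble need a strict majority of nodes to stay available; the helper names `majority` and `tolerated_failures` are made up for the example.

```python
def majority(n):
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many of n nodes can fail while a majority is still reachable."""
    return n - majority(n)


for n in range(1, 8):
    print(f"{n} nodes: majority={majority(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Under that assumption, going from 3 to 4 nodes does not raise the number of failures that can be tolerated, which is why the pool sizes step through odd numbers.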

On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <esggrupos@gmail.com> wrote:
> Hi all,
>
> I have a question about how to deploy ZooKeeper in an NN HA cluster.
>
> As far as I know, I need at least three nodes to run three ZooKeeper
> FailOver Controllers (ZKFC). I plan to place these 3 daemons this way:
>
> - Active NameNode + 1 ZKFC daemon
> - Standby NameNode + 1 ZKFC daemon
> - JobTracker node + 1 ZKFC daemon, (is this right?)
>
> so the quorum is formed with these three nodes. The nodes that run a
> namenode are right because the ZKFC monitors it, but what does the third
> daemon do?
>
> As I read at this URL:
> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>
> these daemons are only related to NameNodes (Health monitoring - the ZKFC
> pings its local NameNode on a periodic basis with a health-check command.),
> so what does the third ZKFC do? I used the JobTracker node, but I could use
> another node without any daemon on it...
>
> Thanks in advance,
>
> ESGLInux,
>
>
>



--
Harsh J
