Return-Path:
X-Original-To: apmail-accumulo-user-archive@www.apache.org
Delivered-To: apmail-accumulo-user-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 2C1DD103FA
    for ; Wed, 13 Nov 2013 23:43:34 +0000 (UTC)
Received: (qmail 86779 invoked by uid 500); 13 Nov 2013 23:43:33 -0000
Delivered-To: apmail-accumulo-user-archive@accumulo.apache.org
Received: (qmail 86751 invoked by uid 500); 13 Nov 2013 23:43:33 -0000
Mailing-List: contact user-help@accumulo.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: user@accumulo.apache.org
Delivered-To: mailing list user@accumulo.apache.org
Received: (qmail 86743 invoked by uid 99); 13 Nov 2013 23:43:33 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 13 Nov 2013 23:43:33 +0000
X-ASF-Spam-Status: No, hits=1.5 required=5.0
    tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS
X-Spam-Check-By: apache.org
Received-SPF: pass (nike.apache.org: domain of texpilot@gmail.com designates 74.125.83.54 as permitted sender)
Received: from [74.125.83.54] (HELO mail-ee0-f54.google.com) (74.125.83.54)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 13 Nov 2013 23:43:27 +0000
Received: by mail-ee0-f54.google.com with SMTP id e51so304049eek.27
    for ; Wed, 13 Nov 2013 15:43:07 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=gmail.com; s=20120113;
    h=mime-version:in-reply-to:references:date:message-id:subject:from:to
     :content-type;
    bh=a0xD1fFvwYPDTL5r2gStEKAnGGLWY6hoPv6piJQIb9c=;
    b=rp58AcNkxlAeGCo/Ylx8HSjl7pe8eDJMx9Rcmb1h1aVbyvwoX1BHOd5qBgdW+XRgDt
     GD+mKlQQg+RES2+VyKwsClS+FsMftpNB6lTzLRvR7gtJQ5UTZ27mbf9SNla9bn9rTyid
     v/WlHofaDV5qrF7EiG6VvATw9Xv6/llo/c4AEwXBRBwqal1WJgfQzOe/luZ24WEnKt01
     bw/5j4ZJp8EJCBoLNqxW0n7wcchcCO6WmoJ/ZZiFKvUhhAdwCjHOMkhoo607tR+u4F3R
     p1yz0cKaKy0mAqn1P4PcjIZvCAr6N5MEIh/lL4O9u6vOAP048aG75Bzra8QisQrNGe8e
     sFvA==
MIME-Version: 1.0
X-Received: by 10.14.127.4 with SMTP id c4mr868265eei.144.1384386187450;
    Wed, 13 Nov 2013 15:43:07 -0800 (PST)
Received: by 10.223.156.3 with HTTP; Wed, 13 Nov 2013 15:43:07 -0800 (PST)
In-Reply-To:
References: <52795A6C.40800@gmail.com>
Date: Wed, 13 Nov 2013 17:43:07 -0600
Message-ID:
Subject: Re: Accumulo Standby Master question
From: "Terry P."
To: "user@accumulo.apache.org"
Content-Type: multipart/alternative; boundary=001a11c3a5f0ef5f8a04eb17860e
X-Virus-Checked: Checked by ClamAV on apache.org

--001a11c3a5f0ef5f8a04eb17860e
Content-Type: text/plain; charset=UTF-8

Correction: I just did an Accumulo cluster shutdown, and the gc on the
secondary namenode did not shut down along with the rest of the cluster. No
errors appeared on the Master when stop-all.sh was called.

No time to research right now, but I wanted to let everyone know.


On Wed, Nov 13, 2013 at 4:29 PM, Terry P. wrote:

> I was able to get to the 1.5.0 source, and start-here.sh has the same
> problem as described above for 1.4.2.
>
>
> On Wed, Nov 13, 2013 at 4:12 PM, Terry P. wrote:
>
>> Revisiting this, I found that even when I have my secondary namenode
>> hostname in the gc file on the master and secondary namenode (my
>> standby master), the gc process would not start on any node but the master.
>>
>> Looking at the config.sh script on lines 102-104:
>>
>> if [ -f "$ACCUMULO_HOME/conf/gc" ]; then
>>     GC=`grep -v '^#' "$ACCUMULO_HOME/conf/gc" | head -1`
>> fi
>>
>> That only puts the first node found in the GC variable, similar to what
>> is done with MASTER.
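To make the point above concrete, here is a minimal sketch of what that pipeline yields when the gc file lists more than one host. It assumes a throwaway directory standing in for $ACCUMULO_HOME and two made-up hostnames (nn1.example.com, nn2.example.com); neither comes from a real cluster, and only the grep/head pipeline itself is taken from the quoted config.sh lines.

```shell
# Hypothetical stand-in for $ACCUMULO_HOME; not a real Accumulo install.
ACCUMULO_HOME=$(mktemp -d)
mkdir -p "$ACCUMULO_HOME/conf"

# Made-up gc file: one comment line followed by two hosts.
printf '%s\n' '# gc hosts' 'nn1.example.com' 'nn2.example.com' > "$ACCUMULO_HOME/conf/gc"

# Same pipeline as the quoted config.sh lines: drop comment lines,
# then keep only the first remaining entry.
if [ -f "$ACCUMULO_HOME/conf/gc" ]; then
    GC=$(grep -v '^#' "$ACCUMULO_HOME/conf/gc" | head -1)
fi

# The second host never reaches the GC variable.
echo "GC=$GC"
```

With these inputs the script prints GC=nn1.example.com, so any later test of the form [ ${host} = ${GC} ] can only ever succeed on the first listed host.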
>>
>> Then in start-here.sh, the masters file is checked to see whether the
>> current host is listed in it (starting at line 38):
>>
>> for host in $HOSTS
>> do
>>     *if grep -q "^${host}\$" $ACCUMULO_HOME/conf/masters*
>>     then
>>         ${bin}/accumulo org.apache.accumulo.server.master.state.SetGoalState NORMAL
>>         ${bin}/start-server.sh $host master
>>         break
>>     fi
>> done
>>
>> *For GC, by contrast, only the ${GC} variable is checked (starting on
>> line 48), and start of the gc is only attempted on the one host in the
>> GC variable*:
>>
>> for host in $HOSTS
>> do
>>     *if [ ${host} = ${GC} ]*
>>     then
>>         ${bin}/start-server.sh *$GC* gc "garbage collector"
>>         break
>>     fi
>> done
>>
>> *I've modified my start-here.sh script's GC section to*:
>>
>> for host in $HOSTS
>> do
>>     *if grep -q "^${host}\$" $ACCUMULO_HOME/conf/gc*
>>     then
>>         ${bin}/start-server.sh *${host}* gc "garbage collector"
>>         break
>>     fi
>> done
>>
>> And that did the trick. I didn't see any changes needed in the
>> stop-here.sh script, as it already iterates over all possible processes to
>> stop them if they are found.
>>
>> Can anyone think of anything I might have missed?
>>
>> I searched for any JIRAs on this but did not find any.
>>
>> I tried to download the 1.5.0 binaries, but the download link doesn't work
>> for me (this has happened before with my work's overly restrictive internet
>> gateway) -- but FYI I asked my brother to download it, and the "mirrors"
>> link under GENERIC BINARIES didn't work for him either.
>>
>> If I can get my hands on the 1.5.0 binaries, I'll check to see if this is
>> already OBE in 1.5.0 today; otherwise it will have to wait until tonight.
>>
>>
>>
>> On Wed, Nov 6, 2013 at 12:12 PM, Terry P. wrote:
>>
>>> Ahh thanks Billie, I'll stick with just one monitor then. Thanks!
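On the modified GC section of start-here.sh quoted earlier in this thread: the sketch below shows the same "is this host listed in the gc file" test in isolation, so it can be tried without a cluster. Everything here is an assumption for illustration: the temporary directory, the hostnames, the helper name host_in_gc_file, and the contents of HOSTS. It also swaps the quoted regex match for grep -Fx, which is a suggested variation rather than what Accumulo ships.

```shell
# Hypothetical stand-in for $ACCUMULO_HOME; not a real Accumulo install.
ACCUMULO_HOME=$(mktemp -d)
mkdir -p "$ACCUMULO_HOME/conf"
printf '%s\n' 'nn1.example.com' 'nn2.example.com' > "$ACCUMULO_HOME/conf/gc"

# Hypothetical helper: succeeds if $1 appears verbatim as a whole line in the gc file.
# -F treats the host as a fixed string (so "." is not a regex wildcard),
# -x requires the entire line to match, -q suppresses output.
host_in_gc_file() {
    grep -Fxq -- "$1" "$ACCUMULO_HOME/conf/gc"
}

# Made-up names this machine might answer to, standing in for $HOSTS.
HOSTS="nn2 nn2.example.com"

STARTED=""
for host in $HOSTS
do
    if host_in_gc_file "$host"
    then
        # The real script would call ${bin}/start-server.sh here;
        # this sketch only records which name matched.
        STARTED=$host
        break
    fi
done

echo "would start gc on: ${STARTED:-none}"
```

With these inputs the short name nn2 does not match and the loop settles on nn2.example.com. The fixed-string match is a small hardening over "^${host}\$", where a dot in the hostname matches any character; whether that matters depends on how hosts are named in the conf files.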
>>>
>>>
>>> On Wed, Nov 6, 2013 at 10:32 AM, Billie Rinaldi <
>>> billie.rinaldi@gmail.com> wrote:
>>>
>>>> On Tue, Nov 5, 2013 at 12:51 PM, Josh Elser wrote:
>>>>
>>>>> It looks like you can configure multiple hosts for GC and they'll use
>>>>> ZooKeeper to perform failover (like the master).
>>>>>
>>>>> Tracers -- You can run multiple tracer processes. You likely don't
>>>>> need 1:1 as you run tservers, but you can run a few if you're concerned
>>>>> about it. They're not required for Accumulo operation.
>>>>>
>>>>> Same for the monitor. If you need to have multiple running for
>>>>> failover purposes, it looks like you can specify multiple and it will just
>>>>> launch a monitor on each host you specified. There's no centralized URL you
>>>>> can always hit here. You would have to check each one to find one that was,
>>>>> unless you want to run some sort of reverse-proxy in front of them all.
>>>>>
>>>>
>>>> I think additional monitors won't work entirely as expected. The log
>>>> forwarding from the other processes is set up when the processes are
>>>> started, and the logs are only sent to the first host in the monitor file.
>>>>
>>>>
>>>>> _obligatory I just looked at the source code to come to this conclusion_
>>>>>
>>>>>
>>>>> On 11/5/13, 3:13 PM, Terry P. wrote:
>>>>>
>>>>>> I put my secondary namenode in the masters file for the first time
>>>>>> with this latest Accumulo 1.4.2 cluster deployment so it would run as a
>>>>>> Standby Accumulo Master, which I read about in the User Guide.
>>>>>> I recently had the Master lose its Zookeeper lock (network glitch, being
>>>>>> researched), and was glad to see the secondary namenode Master process
>>>>>> took over as it should.
>>>>>>
>>>>>> But if my Namenode / Accumulo Master server goes down, I also lose the
>>>>>> gc, monitor, and tracer processes. *Can I configure the
>>>>>> secondary namenode in gc, tracer, and monitor files as well, or should
>>>>>> they run on only one host at a time*? In this case, the GC also lost
>>>>>> its Zookeeper lock, which resulted in a cluster with no GC running at
>>>>>> all until I caught it.
>>>>>>
>>>>>> Thanks in advance.
>>>>>>
>>>>>
>>>>
>>>
>>
>

--001a11c3a5f0ef5f8a04eb17860e
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a11c3a5f0ef5f8a04eb17860e--