From: "Terry P."
To: "user@accumulo.apache.org"
Reply-To: user@accumulo.apache.org
Date: Wed, 13 Nov 2013 16:29:21 -0600
Subject: Re: Accumulo Standby Master question

I was able to get to the 1.5.0 source, and start-here.sh has the same
problem as described in my earlier message for 1.4.2.

On Wed, Nov 13, 2013 at 4:12 PM, Terry P. wrote:

> Revisiting this: I found that even with my secondary namenode's hostname
> in the gc file on both the master and the secondary namenode (my standby
> master), the gc process would not start on any node but the master.
>
> Looking at the config.sh script on lines 102-104:
>
>     if [ -f "$ACCUMULO_HOME/conf/gc" ]; then
>         GC=`grep -v '^#' "$ACCUMULO_HOME/conf/gc" | head -1`
>     fi
>
> That only puts the first node found in the GC variable, similar to what is
> done with MASTER.
>
> Then in start-here.sh, the masters file is checked to see whether the
> current host is listed in it (starting at line 38):
>
>     for host in $HOSTS
>     do
>         if grep -q "^${host}\$" $ACCUMULO_HOME/conf/masters
>         then
>             ${bin}/accumulo org.apache.accumulo.server.master.state.SetGoalState NORMAL
>             ${bin}/start-server.sh $host master
>             break
>         fi
>     done
>
> *For GC, by contrast, only the ${GC} variable is checked (starting on
> line 48), so the gc is only ever started on the one host in the GC
> variable*:
>
>     for host in $HOSTS
>     do
>         if [ ${host} = ${GC} ]
>         then
>             ${bin}/start-server.sh $GC gc "garbage collector"
>             break
>         fi
>     done
>
> *I've modified my start-here.sh script's GC section to*:
>
>     for host in $HOSTS
>     do
>         if grep -q "^${host}\$" $ACCUMULO_HOME/conf/gc
>         then
>             ${bin}/start-server.sh ${host} gc "garbage collector"
>             break
>         fi
>     done
>
> And that did the trick. I didn't see any changes needed in the
> stop-here.sh script, as it already iterates over all possible processes
> and stops them if they are found.
>
> Can anyone think of anything I might have missed?
>
> I searched for any JIRAs on this but did not find any.
>
> I tried to download the 1.5.0 binaries, but the download link doesn't work
> for me (this has happened before with my work's overly restrictive
> internet gateway) -- but FYI I asked my brother to download it, and the
> "mirrors" link under GENERIC BINARIES didn't work for him either.
>
> If I can get my hands on the 1.5.0 binaries, I'll check to see if this is
> already OBE in 1.5.0 today; otherwise it will have to wait until tonight.
>
> On Wed, Nov 6, 2013 at 12:12 PM, Terry P. wrote:
>
>> Ahh thanks Billie, I'll stick with just one monitor then. Thanks!
>>
>> On Wed, Nov 6, 2013 at 10:32 AM, Billie Rinaldi wrote:
>>
>>> On Tue, Nov 5, 2013 at 12:51 PM, Josh Elser wrote:
>>>
>>>> It looks like you can configure multiple hosts for GC and they'll use
>>>> ZooKeeper to perform failover (like the master).
>>>>
>>>> Tracers -- You can run multiple tracer processes. You likely don't need
>>>> one per tserver, but you can run a few if you're concerned about it.
>>>> They're not required for Accumulo operation.
>>>>
>>>> Same for the monitor. If you need to have multiple running for failover
>>>> purposes, it looks like you can specify multiple and it will just
>>>> launch a monitor on each host you specified. There's no centralized URL
>>>> you can always hit here. You would have to check each one to find one
>>>> that was running, unless you want to run some sort of reverse proxy in
>>>> front of them all.
>>>
>>> I think additional monitors won't work entirely as expected. The log
>>> forwarding from the other processes is set up when the processes are
>>> started, and the logs are only sent to the first host in the monitor
>>> file.
>>>
>>>> _obligatory: I just looked at the source code to come to this
>>>> conclusion_
>>>>
>>>> On 11/5/13, 3:13 PM, Terry P. wrote:
>>>>
>>>>> I put my secondary namenode in the masters file for the first time
>>>>> with this latest Accumulo 1.4.2 cluster deployment so it would run as
>>>>> a Standby Accumulo Master, which I read about in the User Guide.
>>>>> Recently the Master lost its ZooKeeper lock (network glitch, being
>>>>> researched), and I was glad to see the secondary namenode's Master
>>>>> process take over as it should.
>>>>>
>>>>> But if my Namenode / Accumulo Master server goes down, I also lose
>>>>> the gc, monitor, and tracer processes. *Can I configure the secondary
>>>>> namenode in the gc, tracer, and monitor files as well, or should they
>>>>> run on only one host at a time*? In this case, the GC also lost its
>>>>> ZooKeeper lock, which resulted in a cluster with no GC running at all
>>>>> until I caught it.
>>>>>
>>>>> Thanks in advance.
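For anyone wanting to see the difference between the two GC checks discussed in this thread without touching a real install, here is a minimal standalone sketch. It makes several assumptions: a throwaway temp directory stands in for $ACCUMULO_HOME, the hostnames are made up, and the function names stock_gc_check and listed_gc_check are purely illustrative (they do not exist in Accumulo's scripts). It also swaps the anchored regex from the message for grep -Fxq (fixed string, whole-line match), so that dots in hostnames are not treated as regex wildcards; that is a substitution, not what the thread used.

```shell
#!/bin/sh
# Standalone sketch: compares a single-host GC check with a per-line
# lookup in conf/gc. All paths and hostnames here are hypothetical.

DEMO_HOME=$(mktemp -d)                 # stand-in for $ACCUMULO_HOME
trap 'rm -rf "$DEMO_HOME"' EXIT        # clean up the temp dir on exit
mkdir -p "$DEMO_HOME/conf"
printf '%s\n' '# gc hosts' 'master1.example.com' 'standby1.example.com' \
    > "$DEMO_HOME/conf/gc"

# Same selection as the config.sh snippet: first non-comment line only.
GC=$(grep -v '^#' "$DEMO_HOME/conf/gc" | head -n 1)

# Single-host check: true only for the first host in conf/gc.
stock_gc_check() {
    [ "$1" = "$GC" ]
}

# Per-line check: true for any host listed in conf/gc.
# -F fixed string, -x match the whole line, -q no output.
listed_gc_check() {
    grep -Fxq "$1" "$DEMO_HOME/conf/gc"
}

for host in master1.example.com standby1.example.com other.example.com
do
    if stock_gc_check "$host"; then stock=yes; else stock=no; fi
    if listed_gc_check "$host"; then listed=yes; else listed=no; fi
    echo "$host stock=$stock listed=$listed"
done
```

Run as-is, the standby host should report stock=no but listed=yes, which mirrors the behavior described in the thread: with the single-host check the gc can only ever be started on the first entry of the gc file.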