From: "Terry P." <texpilot@gmail.com>
To: user@accumulo.apache.org
Reply-To: user@accumulo.apache.org
Date: Tue, 8 Oct 2013 19:48:59 -0500
Subject: Re: Accumulo init over existing instance
Mailing-List: contact user-help@accumulo.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a11c1af4c3a024a04e84440ac

--001a11c1af4c3a024a04e84440ac
Content-Type: text/plain; charset=UTF-8

Thanks Keith, great information. We're just entering formal test, though, so
1.5 isn't an option for this project. Still, it's great to know that moving
the walogs to HDFS appears to have helped this issue significantly.

Thanks again.

On Tue, Oct 8, 2013 at 7:37 PM, Keith Turner <keith@deenlo.com> wrote:

> On Tue, Oct 8, 2013 at 7:50 PM, Terry P. <texpilot@gmail.com> wrote:
>
>> Thanks Jared.
>>
>> John, thanks for the warning! I lost a dev cluster once when we had to
>> re-IP the Accumulo servers, but reverse DNS wasn't configured and I
>> assumed that was why. Guess it wasn't.
>>
>> Keith, I read through ACCUMULO-1585 but it wasn't completely clear if the
>> change proposed would also allow a server or servers in a cluster to have
>> their IP addresses changed. I hope it will, because while having to re-IP
>> a server or cluster is fairly rare, it certainly happens (as it did in
>> our case).
>
> I think moving from 1.4 to 1.5 will help. In 1.4 Accumulo has logger
> servers that store write-ahead logs/edit logs. Data stored on these
> loggers is needed when a tablet server crashes. Accumulo stores pointers
> to loggers using IP addresses. So if the IP address of the machine running
> a logger changes, then Accumulo can no longer find the data needed to
> recover from a fault.
>
> Starting w/ 1.5, Accumulo stores write-ahead logs in HDFS, and the
> pointers to these walogs are now HDFS paths. The IP addrs that are still
> stored in 1.5 in ZooKeeper and the metadata table are more transient. For
> example, locations of tablets are stored in the metadata table using IP
> addrs. If a tablet server dies and restarts w/ a different IP addr, it's
> probably OK, because the tablet will just be reassigned to a different
> tablet server. You may lose some locality because Accumulo prefers to
> assign a tablet to the last place it compacted data, but things should
> still work.
>
> I have not tried changing IP addrs w/ a 1.5 instance, so I do not know if
> there are other problems. But I do know that the walogs were a problem in
> 1.4, and that should no longer be a problem in 1.5.
>
>> Thanks all,
>> Terry
>>
>> On Tue, Oct 8, 2013 at 5:14 PM, Keith Turner <keith@deenlo.com> wrote:
>>
>>> On Tue, Oct 8, 2013 at 6:07 PM, John Vines <vines@apache.org> wrote:
>>>
>>>> Like Jared said, wiping /accumulo out of HDFS is all you need to do.
>>>>
>>>> But Accumulo still uses IP addresses internally, so I'm not quite
>>>> certain you're going to achieve what you set out to do.
>>>
>>> Until 1.6.0 w/ ACCUMULO-1585
>>>
>>>> On Tue, Oct 8, 2013 at 5:32 PM, Terry P. <texpilot@gmail.com> wrote:
>>>>
>>>>> So reverse DNS wasn't working when I deployed my new cluster, thus
>>>>> all my Tablet Servers were showing up in the Monitor as IP addresses
>>>>> (even though all configuration files had hostnames only). Lesson
>>>>> learned: trust, but verify (and ensure your hardened base servers
>>>>> still have nslookup and/or dig on them).
>>>>>
>>>>> Now that DNS is fixed, I want to wipe everything clean and re-init
>>>>> Accumulo so that everything uses hostnames and the cluster is not
>>>>> tied to IP addresses.
>>>>>
>>>>> I know I need to do a new 'accumulo init' -- I'll pass in the same
>>>>> instance name, and my understanding is that will overwrite everything
>>>>> currently in ZooKeeper.
>>>>>
>>>>> My question is: is there anything else I could/should do first to
>>>>> "clean up" from this botched instance? E.g. should I delete all files
>>>>> in HDFS, the write-ahead logs on the Tablet Servers, etc.? I'm running
>>>>> Accumulo 1.4.2.
>>>>>
>>>>> Thanks,
>>>>> Terry

--001a11c1af4c3a024a04e84440ac
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

--001a11c1af4c3a024a04e84440ac--
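
The "trust, but verify" lesson from the thread (confirm that forward and reverse DNS agree before deploying or re-initializing) can be sketched as a small script. This is only an illustration using Python's standard socket module; the function name check_host and the file name check_dns.py are hypothetical and not part of Accumulo. It assumes the names you pass in are written the same way DNS returns them (a short name in a config file will not match a fully qualified PTR record).

```python
import socket
import sys


def check_host(name, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Check that `name` resolves to an IP and that the IP resolves back to `name`.

    Returns (ok, detail). The resolver functions are parameters so the
    check can be exercised without a live DNS server.
    """
    try:
        ip = forward(name)
    except OSError as err:  # socket.gaierror is a subclass of OSError
        return False, f"{name}: forward lookup failed ({err})"
    try:
        primary, aliases, _ = reverse(ip)
    except OSError as err:  # socket.herror is a subclass of OSError
        return False, f"{name}: {ip} has no usable reverse (PTR) record ({err})"
    # Compare case-insensitively and ignore a trailing root dot.
    known = {n.rstrip(".").lower() for n in [primary, *aliases]}
    if name.rstrip(".").lower() not in known:
        return False, f"{name}: {ip} reverses to {primary}, not {name}"
    return True, f"{name}: {ip} resolves back to {primary}"


if __name__ == "__main__":
    # Usage: python check_dns.py host1 host2 ...
    for host in sys.argv[1:]:
        ok, detail = check_host(host)
        print("OK  " if ok else "FAIL", detail)
```

Run it on each server with the hostnames from your configuration files as arguments; any FAIL line points at a host that would likely show up by IP address rather than by name.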