Subject: Re: Waiting for accumulo to be initialized
From: Eric Newton <eric.newton@gmail.com>
To: user@accumulo.apache.org
Date: Wed, 27 Mar 2013 16:19:43 -0400

I should write this up in the user manual. It's not that hard, but it's really not the first thing you want to tackle while learning how to use Accumulo. I just opened ACCUMULO-1217 to do that.

I wrote this from memory: expect errors. Needless to say, you would only want to do this when you are more comfortable with Hadoop, ZooKeeper, and Accumulo.

First, get ZooKeeper up and running, even if you have to delete all its data.
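
(A from-memory sketch of wiping a standalone ZooKeeper, in the same expect-errors spirit; the dataDir below comes from your zoo.cfg, so the path is only an assumption:)

$ ./bin/zkServer.sh stop            # run from the zookeeper install dir
$ rm -rf /var/zookeeper/version-2   # version-2 under dataDir holds the data
$ ./bin/zkServer.sh start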

Next, attempt to determine the mapping of table names to tableIds. You can do this in the shell when your Accumulo instance is healthy. If it isn't healthy, you will have to guess based on the data in the files in HDFS.
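
(While the instance is still healthy you can save that mapping ahead of time; untested sketch, but the shell's -e option runs a single command non-interactively:)

$ ./bin/accumulo shell -u root -p mysecret -e "tables -l" > table-ids.txt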

So, for example, the table "trace" is probably table id "1". You can find the files for trace in /accumulo/tables/1.
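
(If you do have to guess, listing the tables directory at least shows which ids exist; each subdirectory is one table id:)

$ hadoop fs -ls /accumulo/tables
$ hadoop fs -ls /accumulo/tables/1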

Don't worry if you get the names wrong. You can always rename the tables later.
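
(Renaming is a one-liner in the shell; "mytable" is just a made-up target name here:)

shell > renametable table1 mytable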

Move the old files for Accumulo out of the way and re-initialize:

$ hadoop fs -mv /accumulo /accumulo-old
$ ./bin/accumulo init
$ ./bin/start-all.sh

Recreate your tables:

$ ./bin/accumulo shell -u root -p mysecret
= shell > createtable table1

Learn the new table id mapping:
shell > tables -l
!METADATA => !0
trace => 1
table1 => 2
= ...

Bulk import all your data back into the new table ids. Assuming you have determined that "table1" used to be table id "a" and is now "2", you do something like this:

$ hadoop fs -mkdir /tmp/failed
$ ./bin/accumulo shell -u root -p mysecret
shell > table table1
shell table1 > importdirectory /accumulo-old/tables/a/default_tablet /tmp/failed true

There are lots of directories under every table id directory. You will need to import each of them. I suggest creating a script and passing it to the shell on the command line.
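
(A rough sketch of such a script, from memory like everything above: the old id "a", the new table name, and the credentials are placeholders, and the parsing of hadoop fs -ls output may need tweaking for your Hadoop version. The shell's -f option executes commands from a file:)

#!/usr/bin/env bash
# Emit one importdirectory command per directory under the old table id,
# then hand the whole batch to the accumulo shell in one go.
OLD_ID=a          # old table id, from the mapping you worked out above
NEW_TABLE=table1  # the freshly created table it maps to

hadoop fs -mkdir /tmp/failed   # failure directory; must exist and be empty

echo "table $NEW_TABLE" > import-commands.txt
# keep only directory entries: their permission string starts with 'd'
for dir in $(hadoop fs -ls /accumulo-old/tables/$OLD_ID | grep '^d' | awk '{print $NF}'); do
  echo "importdirectory $dir /tmp/failed true" >> import-commands.txt
done

./bin/accumulo shell -u root -p mysecret -f import-commands.txt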

I know of instances in which trillions of entries were recovered and available in a matter of hours.

-Eric


On Wed, Mar 27, 2013 at 3:39 PM, Aji Janis <aji1705@gmail.com> wrote:

> when you say "you can move the files aside in HDFS" .. which files are
> you referring to? I have never set up zookeeper myself so I am not aware
> of all the changes needed.
>
> On Wed, Mar 27, 2013 at 3:33 PM, Eric Newton <eric.newton@gmail.com> wrote:
>
>> If you lose zookeeper, you can move the files aside in HDFS, recreate
>> your instance in zookeeper, and bulk import all of the old files. It's
>> not perfect: you lose table configurations, split points, and user
>> permissions, but you do preserve most of the data.
>>
>> You can back up each of these bits of information periodically if you
>> like. Outside of the files in HDFS, the configuration information is
>> pretty small.
>>
>> -Eric
>>
>> On Wed, Mar 27, 2013 at 3:18 PM, Aji Janis <aji1705@gmail.com> wrote:
>>
>>> Eric and Josh, thanks for all your feedback. We ended up losing all
>>> our accumulo data because I had to reformat hadoop. Here is, in a
>>> nutshell, what I did:
>>>
>>>   1. Stop accumulo
>>>   2. Stop hadoop
>>>   3. On the hadoop master and all datanodes, remove everything under
>>>      the data folder named by dfs.data.dir (hdfs-site.xml)
>>>   4. On the hadoop master, remove everything under the name folder
>>>      named by dfs.name.dir (hdfs-site.xml)
>>>   5. As the hadoop user, execute .../hadoop/bin/hadoop namenode -format
>>>   6. As the hadoop user, execute .../hadoop/bin/start-all.sh ==> should
>>>      populate the data/ and name/ dirs that were erased in steps 3 and 4
>>>   7. Initialize Accumulo - as the accumulo user, run
>>>      .../accumulo/bin/accumulo init (I created a new instance)
>>>   8. Start accumulo
>>>
>>> I was wondering if anyone had suggestions or thoughts on how I could
>>> have solved the original issue of accumulo waiting to be initialized
>>> without losing my accumulo data? Is it possible to do so?