Date: Fri, 19 Jul 2013 14:15:00 -0500
Subject: Re: Zookeeper ensemble backup questions?
From: Sergey Maslyakov
To: user@zookeeper.apache.org

I can share this patch based on 3.4.5, which does the trick.

It adds a "snps" 4lw command that accepts one mandatory argument, which is
an absolute path for the directory where the snapshot file will be dropped.
The "absoluteness" of the path is verified by UNIX rules. Not sure how it
would work in Windows, though. The target directory must exist and be
writeable by the effective UID of the Zookeeper server.

If the operation was successful, the Zookeeper server responds with the
absolute path of the snapshot file. You can watch for a leading '/'
character in the response to trigger your reaction to it.

In my case, a 700MB snapshot takes about 30 seconds to write out.

Please see several examples below:

~ $ mkdir /tmp/snapshot-test

~ $ telnet localhost 12181
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
snps /tmp/snapshot-test
/tmp/snapshot-test/snapshot.316c8
Connection to localhost closed by foreign host.

~ $ ls -al /tmp/snapshot-test/snapshot.316c8
-rw-r--r--   1 srvr     srvr     719602373 Jul 19 14:09 /tmp/snapshot-test/snapshot.316c8

~ $ telnet localhost 12181
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
snps blah
Snapshot directory path must be absoulte, i.e., it must start with '/'. Path "blah" does not meet the criteria.
Connection to localhost closed by foreign host.

~ $ telnet localhost 12181
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
snps /tmp/blah
Error while serializing snapshot into /tmp/blah/snapshot.316c8. /tmp/blah/snapshot.316c8 (No such file or directory)
Connection to localhost closed by foreign host.

~ $ telnet localhost 12181
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
snps
Snapshot directory path must be absoulte, i.e., it must start with '/'. Path "" does not meet the criteria.
Connection to localhost closed by foreign host.

~ $


On Fri, Jul 19, 2013 at 1:42 PM, jack ma <jackma1402@gmail.com> wrote:

> Thanks Sergei.
>
> That is a great improvement idea for zookeeper. I think that zookeeper
> is planning to add a new 4lw command "snap", but it is not ready yet.
>
> My original questions are based on the current version of zookeeper
> (3.4.5). Do you have any answers for them?
>
> I appreciate the help.
>
> thanks
> Jack
>
>
> On Fri, Jul 19, 2013 at 11:19 AM, Sergey Maslyakov <evolvah@gmail.com>
> wrote:
>
> > Jack,
> >
> > Here is how I see the backup process happening.
> >
> > 1. Zookeeper server can be changed to support a new 4lw that will
> > write out the current state of the DataTree into a snapshot file with
> > the path and name provided as an argument to this new command (barring
> > all the permissions, disk space, and other system-level restrictions).
> > Probably, I would ask Zookeeper to save the snapshot in a directory
> > outside of the standard "dataLog" for the sake of cleanliness.
> >
> > 2. When Zookeeper server responds to the new "snapshot" command with a
> > success indication, the requesting process knows that the file has
> > been written out and it can go and process it. It can add some
> > metadata and create an archive to store it somewhere, for example.
> > Alternatively, Zookeeper server could stream the data it would have
> > written into a snapshot as the response to the new "snapshot" command.
> > This way, the client becomes responsible for persistence, which lifts
> > a number of permission-related issues (but raises some other issues
> > too). Oh, and by the way, it looks like snapshot files are rather
> > compressible. I saw compression factors of 20 and more on the data
> > that I have.
> >
> > 3. Disk cleanups are performed.
> >
> > With this backup procedure the restore would turn into:
> >
> > 1. Stopping all ensemble members
> >
> > 2. Wiping out dataDir/version-2 and dataLogDir/version-2
> >
> > 3. Restoring the snapshot taken by the above backup procedure on one
> > of the servers into dataDir/version-2
> >
> > 4. Bringing this server online
> >
> > 5. Allowing some time for it to load the snapshot. You could send the
> > "isro" 4lw command to it to see when it stops responding with "null".
> > When the response becomes "ro" or "rw", it is ready to populate the
> > others with its own data
> >
> > 6. Bringing up the other servers one by one, to allow them to form a
> > quorum with the populated server
> >
> >
> > Hope this helps! I'd be glad to hear from people who know the
> > internals of Zookeeper server better whether this approach is flawed
> > or robust.
> >
> >
> > Regards,
> > /Sergey
> >
> >
> > On Fri, Jul 19, 2013 at 1:00 PM, jack ma <jackma1402@gmail.com> wrote:
> >
> > > I asked these questions in the thread
> > >
> > > http://mail-archives.apache.org/mod_mbox/zookeeper-user/201307.mbox/%3cCAB+cfdwhOV0JfB04=MpO_+i-4ou=VbL=EG2XS557+j+698jx3A@mail.gmail.com%3e
> > >
> > > but there was no response. So I am posting them again here;
> > > hopefully I can get help from the community.
> > >
> > > I want to make sure I fully understand the procedures for zookeeper
> > > backup and disaster recovery:
> > >
> > > For the backup procedure on a zookeeper ensemble:
> > > (1) Log in to any host whose state is "Serving"
> > >     Question:
> > >         Do I have to log in to the leader node, or is any node ok?
> > > (2) Copy the latest snapshot file and transaction log from the
> > >     version-2 directory.
> > >     Question:
> > >         How do we make sure we do not copy corrupt files if the
> > >         snapshot/transaction log is in the middle of an update? Do
> > >         we have to shut down the node to make the copy?
> > >         Besides the transaction log and snapshot, do we have to
> > >         copy other files such as the epoch files?
> > >
> > > For the disaster recovery procedure on a zookeeper ensemble:
> > > (1) Recreate the machines for the zookeeper ensemble
> > > (2) Copy the snapshot/transaction log we backed up into the
> > >     zookeeper dataDir/version-2 and dataLogDir/version-2.
> > >     Question:
> > >         Do we have to copy the epoch files?
> > >         Do we have to copy the backed-up snapshot/transaction log to
> > >         all the zookeeper nodes, or just to the first node we start?
> > >
> > > Appreciate your time and help.
> > > Jack
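
For anyone who wants to script the exchanges from this thread instead of using telnet, here is a minimal sketch in Python. It rests on a few assumptions: the patched server reads "snps" and its argument as one newline-terminated line (as in the telnet sessions in this message), it closes the connection after replying, and a successful reply begins with '/'. The function names, the timeouts, and the polling interval are illustrative only and are not part of the patch. Note that "snps" exists only on a server built with the patch; "isro" is the stock 4lw mentioned in the restore steps.

```python
import socket
import time


def send_4lw(host, port, command, timeout=120.0):
    """Send one four-letter-word command and return the whole text reply.

    Assumption: the server closes the connection once it has replied,
    so we simply read until end of stream.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace").strip()


def parse_snps_reply(reply):
    """Return the snapshot path on success, else raise with the server text.

    Success is recognised by a leading '/', because error messages may
    also contain '/' further into the text.
    """
    if reply.startswith("/"):
        return reply
    raise RuntimeError(reply or "empty reply from server")


def wait_until_serving(host, port, poll_seconds=2.0, max_wait=300.0,
                       query=send_4lw):
    """Poll "isro" until the server answers "ro" or "rw" (restore step 5).

    Any other reply (for example "null") or a connection error means the
    server is not ready yet. `query` is injectable to allow offline tests.
    """
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        try:
            reply = query(host, port, "isro")
        except OSError:
            reply = ""
        if reply in ("ro", "rw"):
            return reply
        time.sleep(poll_seconds)
    raise TimeoutError("server did not report ro/rw within %s s" % max_wait)


# Hypothetical usage against the server from the telnet examples:
#   path = parse_snps_reply(send_4lw("localhost", 12181,
#                                    "snps /tmp/snapshot-test"))
#   mode = wait_until_serving("localhost", 12181)
```

The generous default timeout in send_4lw reflects the roughly 30 seconds reported above for a 700MB snapshot; larger data sets would presumably need more.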