Date: Thu, 10 Mar 2011 13:47:32 -0800
Subject: Re: A list of HBase backup options
From: Stack
To: user@hbase.apache.org
Cc: Otis Gospodnetic

On Thu, Mar 10, 2011 at 12:31 PM, Otis Gospodnetic wrote:
>> Options 1) and 2) will give you a snapshot on a table at a particular
>> instance in time. You'll get the state of the row at the time the
>> MapReduce job crosses that row.
>
> Hm, isn't this contradictory? That is, doesn't "snapshot of a table at a
> particular instance in time" mean that I'd get a snapshot of *all* rows at a
> single point in time, and not a value of a row when the Export or Copy MR job
> crosses it?
>

Sorry for the sloppy phrasing. HBase being a distributed system, getting a
consistent view of a table at a particular moment in time would be a
little tough. The only thing we guarantee (currently w/ some caveats,
see HBASE-2856) is consistency at the row level.

> Also, it seems like all options are per-table, right? There is nothing other
> than near real-time full-cluster replication that would back up all tables at
> once?

Right.
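
To make the "state of the row at the time the job crosses that row" point concrete, here is a toy Python model. It is not HBase code: the dict-backed table, the `export_with_concurrent_writes` helper and the write schedule are all made up for illustration. It only shows that a row-by-row copy taken while writes continue need not match the table at any single instant.

```python
# Toy model only (no HBase involved): a scan-style export copies rows one at
# a time while a writer keeps updating the live table.

def export_with_concurrent_writes(table, writes):
    """Copy `table` row by row in sorted key order.

    Before the i-th row is read, writes[i] (a (row_key, value) pair, or None)
    is applied to the live table, simulating a concurrent client.
    """
    backup = {}
    keys = sorted(table)  # row keys are fixed when the "scan" starts
    for i, key in enumerate(keys):
        if i < len(writes) and writes[i] is not None:
            write_key, write_value = writes[i]
            table[write_key] = write_value
        backup[key] = table[key]  # each row is captured as it is crossed
    return backup


live = {"row1": "v0", "row2": "v0", "row3": "v0"}
start_state = dict(live)

# Hypothetical schedule: row3 changes before the scan reaches it,
# row1 changes after the scan has already passed it.
backup = export_with_concurrent_writes(
    live, [None, ("row3", "v1"), ("row1", "v1")]
)
end_state = dict(live)
```

In this run the backup holds the old value of row1 and the new value of row3, so it equals neither `start_state` nor `end_state`. Each row is still internally consistent, which is the only level at which the guarantee applies.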
> This is important when you have multiple tables storing data that depend on each
> other. Imagine tables A and B where table B depends on A. If you first back up
> A, then by the time I back up B, it may reference some data in A that my A's
> backup doesn't contain. If you flip the order and first back up B, then by the
> time I back up A it may contain some extra data that B's backup doesn't refer
> to.
>

Yes.

> Simply put, the backup copies of these 2 tables won't be in sync.
>
> How do people deal with this?
>

If you want them in sync, you are into the world of cross-row,
cross-table transactions. Replicated copies of the tables will eventually
be consistent with each other (Replication is edit-scoped, not table or
even x-table scoped).

> Would it make sense to document this sort of stuff on
> http://hbase.apache.org/book/book.html ?
>

You mean the list of backup options? Yes. And their individual
failings/constraints.

(Otis, in this list you've made other useful 'lists', the reporting
one for instance, that I've put on my 'doc this' list, my list of
things to add into the manual when I have a minute.)

St.Ack
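
The A-then-B ordering problem from the thread can be sketched the same way. Again this is a toy Python model with plain dicts standing in for tables; the key names and the `dangling_refs` helper are hypothetical and nothing here calls HBase.

```python
# Toy model only: table B stores references to row keys in table A, and the
# two tables are backed up one after the other while writes continue.

def dangling_refs(a_backup, b_backup):
    """Return the A-keys referenced from B's backup but absent from A's backup."""
    return sorted(ref for ref in b_backup.values() if ref not in a_backup)


table_a = {"a1": "payload"}
table_b = {"b1": "a1"}  # b1 refers to row a1 in table A

a_backup = dict(table_a)   # back up A first

table_a["a2"] = "payload"  # writes land between the two backups
table_b["b2"] = "a2"

b_backup = dict(table_b)   # back up B second

missing = dangling_refs(a_backup, b_backup)
```

Here `missing` contains `a2`: B's backup points at a row that A's backup never captured. Flipping the order leaves no dangling reference but gives A's backup an extra row that nothing in B's backup refers to, which is the other case described above.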