hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: A list of HBase backup options
Date Thu, 10 Mar 2011 21:47:32 GMT
On Thu, Mar 10, 2011 at 12:31 PM, Otis Gospodnetic
<otis_gospodnetic@yahoo.com> wrote:
>> Options 1) and 2) will give you a snapshot on a table at a  particular
>> instance in time.  You'll get the state of the row at the  time the
>> MapReduce job crosses that row.
>
> Hm, isn't this contradictory?  That is, doesn't "snapshot of a table at a
> particular instance in time" means that I'd get a snapshot of *all* rows at a
> single point in time, and not a value of a row when the Export or Copy MR job
> crosses it?
>

Sorry for the sloppy phrasing.

HBase being a distributed system, getting a consistent view on a table
at a particular moment in time would be a little tough.  The only thing
we guarantee -- currently w/ some caveats, HBASE-2856 -- is consistency
at the row level.
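
To make the distinction concrete, here is a toy sketch in plain Python.
It is not HBase code: the dict-based "table", the export_while_writing
helper and the write schedule are all made up for illustration.  It
copies rows one at a time while a client rewrites every row partway
through, the way an Export or Copy job would see them.

```python
# Toy model only (not HBase code): a "table" is a dict of row key -> columns.
def export_while_writing(table, writes_by_step):
    """Copy rows in sorted key order.

    Before copying the row at position `step`, apply the writes scheduled
    for that step. Each write replaces a whole row at once, standing in
    for a row-level atomic update.
    """
    backup = {}
    for step, row_key in enumerate(sorted(table)):
        for key, new_row in writes_by_step.get(step, []):
            table[key] = dict(new_row)
        backup[row_key] = dict(table[row_key])
    return backup


v1 = {"a": 1, "b": 1}
v2 = {"a": 2, "b": 2}
table = {"r1": dict(v1), "r2": dict(v1), "r3": dict(v1)}

# A client rewrites every row to v2 after the job has copied r1
# but before it reaches r2.
writes = {1: [("r1", v2), ("r2", v2), ("r3", v2)]}
backup = export_while_writing(table, writes)

for key in sorted(backup):
    print(key, backup[key])
```

Each copied row is internally consistent (a and b always match), but
the backup holds r1 as it was before the rewrite and r2, r3 as they
were after it.  No single moment in the table's history looks like
that.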


> Also, it seems like all options are per-table, right?  There is nothing other
> than near real-time full-cluster replication that would back up all tables at
> once?

Right.


> This is important when you have multiple tables storing data that depend on each
> other.  Imagine tables A and B where table B depends on A.  If you first back up
> A, then by the time I back up B, it may reference some data in A that my A's
> backup doesn't contain.  If you flip the order and first back up B, then by the
> time I back up A it may contain some extra data that B's backup doesn't refer
> to.
>

Yes.


> Simply put, the backup copies of these 2 tables won't be in sync.
>
> How do people deal with this?
>

If you want them in sync, you are into the world of cross-row,
cross-table transactions.  Replicated copies of the tables will
eventually be consistent with each other (replication is edit-scoped,
not table-scoped or even x-table-scoped).
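
The A/B problem described above can be reproduced with another toy
sketch in plain Python.  Again this is not HBase code: the two dicts,
the row keys and the helper names are hypothetical, and each backup is
treated as instantaneous to keep the example short.

```python
# Toy model only: table B maps its row keys to the key of a row in table A.
def dangling_refs(backup_a, backup_b):
    """Keys that B's backup refers to but A's backup does not contain."""
    return sorted(ref for ref in backup_b.values() if ref not in backup_a)


def unreferenced_rows(backup_a, backup_b):
    """Rows in A's backup that nothing in B's backup refers to."""
    referenced = set(backup_b.values())
    return sorted(key for key in backup_a if key not in referenced)


def live_write(a, b):
    # One logical change that touches both tables.
    a["a2"] = "y"
    b["b2"] = "a2"


# Order 1: back up A, a write lands, then back up B.
a, b = {"a1": "x"}, {"b1": "a1"}
backup_a = dict(a)
live_write(a, b)
backup_b = dict(b)
a_then_b = dangling_refs(backup_a, backup_b)

# Order 2: back up B, a write lands, then back up A.
a, b = {"a1": "x"}, {"b1": "a1"}
backup_b = dict(b)
live_write(a, b)
backup_a = dict(a)
b_then_a = unreferenced_rows(backup_a, backup_b)

print(a_then_b, b_then_a)
```

Backing up A first leaves B's backup pointing at a row that A's backup
lacks; backing up B first leaves A's backup with a row that B's backup
never refers to.  Either order gives copies that are out of sync
unless writes stop for the duration.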

> Would it make sense to document this sort of stuff on
> http://hbase.apache.org/book/book.html ?
>

You mean the list of backup options?  Yes.  And their individual
failings/constraints.

(Otis, on this list you've made other useful 'lists' -- the reporting
one for instance -- that I've put on my 'doc this' list, my list of
things to add to the manual when I have a minute).

St.Ack
