hbase-user mailing list archives

From "Ding, Hui" <hui.d...@sap.com>
Subject RE: [LIKELY JUNK]Back Up Strategies
Date Mon, 22 Sep 2008 19:20:17 GMT
This should be something the operators of your data store worry about.
E.g., if HDFS keeps three replicas, one should be on the local rack,
another on a different rack (to protect against a power outage),
and a third in a remote data center...
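
A minimal sketch of how an operator might verify that placement from a
client, using the Hadoop FileSystem API; the path is a placeholder and
this assumes rack awareness is configured on the cluster:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.IOException;
import java.util.Arrays;

public class CheckPlacement {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
            // Placeholder; point at any file, e.g. an HBase store file.
            FileStatus status = fs.getFileStatus(new Path("/hbase/somefile"));
            // One BlockLocation per block, listing every replica's host.
            for (BlockLocation block :
                    fs.getFileBlockLocations(status, 0, status.getLen())) {
                // Topology paths look like /rack-a/host:port; distinct
                // rack prefixes show the replicas really span racks.
                System.out.println(Arrays.toString(block.getTopologyPaths()));
            }
        }
    }
}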

If you have only a small cluster, then maybe use a UPS to guard against
power outages and watch out for storms?
After all, what are the chances that a meteorite hits your data center?

-----Original Message-----
From: Charles Mason [mailto:charlie.mas@gmail.com] 
Sent: Monday, September 22, 2008 12:13 PM
To: hbase-user@hadoop.apache.org
Subject: [LIKELY JUNK]Back Up Strategies

Hi All,

I was wondering what options there are for backing up and dumping an
HBase database. I appreciate that having it run on top of an HDFS
cluster protects against individual node failure. However, that
still doesn't protect against the massive but thankfully rare
disasters which take out whole server racks: fire, floods, etc.

As far as I can tell there are two options:

1. Scan each table and dump every row to some external location, as
mysqldump does for MySQL. To recover, simply write the data back in.
I am sure the performance of this is going to be fairly poor
(a first sketch follows the list).

2. Image the data stored on the HDFS cluster. Aren't there some big
issues with it not capturing a consistent image, as some updates won't
have been flushed? Is there any way to force that, or to make it
consistent some other way, perhaps via snapshotting? (A second sketch
follows the list.)
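
For option 1, here is a minimal sketch of a scan-and-dump, written
against the newer HBase client Connection/Table API rather than the
0.x API current when this was posted; the table name "mytable" and
the output file are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;
import java.io.PrintWriter;

public class TableDump {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("mytable"));
             ResultScanner scanner = table.getScanner(new Scan());
             PrintWriter out = new PrintWriter("mytable.dump")) {
            // Stream the whole table; each Result is one row's cells.
            for (Result row : scanner) {
                for (Cell cell : row.rawCells()) {
                    out.printf("%s\t%s:%s\t%d\t%s%n",
                        Bytes.toString(CellUtil.cloneRow(cell)),
                        Bytes.toString(CellUtil.cloneFamily(cell)),
                        Bytes.toString(CellUtil.cloneQualifier(cell)),
                        cell.getTimestamp(),
                        Bytes.toString(CellUtil.cloneValue(cell)));
                }
            }
        }
    }
}

Later HBase releases also bundle an Export/Import MapReduce job
(org.apache.hadoop.hbase.mapreduce.Export) that does essentially this
in parallel.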
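
For option 2, a flush can at least be forced before imaging; a sketch
under the same assumptions (newer client API, placeholder table name):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import java.io.IOException;

public class FlushBeforeImaging {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Ask the region servers to flush this table's in-memory
            // writes (MemStores) down to files on HDFS, so a filesystem
            // image taken afterwards misses less data.
            admin.flush(TableName.valueOf("mytable"));
        }
    }
}

Writes arriving after the flush are still missed, so this only narrows
the window; later HBase versions added true table snapshots
(Admin.snapshot), which are the cleaner answer.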

Have I missed anything? Anyone got any suggestions?

Charlie M
