hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Recommended backup/restore solution for hbase
Date Wed, 28 Sep 2011 21:13:40 GMT
You can now (0.92+) set the minimum number of versions that should always be kept around, together
with TTL. See HBASE-4071.
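
To make the interaction concrete, here is a minimal sketch of the retention rule as described above: a version is dropped once it is older than the TTL, unless it is among the newest `minVersions` versions. This is a simplified model written for illustration, not HBase code; the class and method names are hypothetical, and the model ignores the maximum-versions setting and delete markers.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Simplified model of TTL plus minimum-versions retention; not HBase code. */
final class RetentionSketch {

    /**
     * Returns the version timestamps that survive cleanup, newest first.
     *
     * @param timestampsMs version timestamps of one cell, in any order
     * @param nowMs        current time in milliseconds
     * @param ttlMs        time-to-live in milliseconds
     * @param minVersions  number of newest versions kept even when expired
     */
    static List<Long> survivors(List<Long> timestampsMs, long nowMs, long ttlMs, int minVersions) {
        List<Long> sorted = new ArrayList<>(timestampsMs);
        sorted.sort(Comparator.reverseOrder()); // newest first
        List<Long> kept = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            long ts = sorted.get(i);
            boolean expired = nowMs - ts > ttlMs;     // older than the TTL
            boolean protectedByMin = i < minVersions; // among the newest N versions
            if (!expired || protectedByMin) {
                kept.add(ts);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000;
        long now = 100 * day;
        List<Long> versions = List.of(now - day, now - 40 * day, now - 90 * day);
        // TTL of 30 days, no minimum: only the one-day-old version remains.
        System.out.println(survivors(versions, now, 30 * day, 0));
        // TTL of 30 days, minimum of 2: the 40-day-old version is kept as well.
        System.out.println(survivors(versions, now, 30 * day, 2));
    }
}
```

With a minimum of 1, the newest version of a cell stays readable even after every version has passed its TTL, which is the case the setting is meant to cover.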


-- Lars



________________________________
From: "Buttler, David" <buttler1@llnl.gov>
To: "user@hbase.apache.org" <user@hbase.apache.org>
Sent: Wednesday, September 28, 2011 2:10 PM
Subject: RE: Recommended backup/restore solution for hbase

Wouldn't using a TTL on your data automatically delete data that is older than X months? 
Of course major compactions have to occur to get the data to automatically disappear.

See:
http://hbase.apache.org/book.html#ttl
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html#HColumnDescriptor(byte[], int, java.lang.String, boolean, boolean, int, int, java.lang.String, int)

Dave

-----Original Message-----
From: tvinod@socialyantra.com [mailto:tvinod@socialyantra.com] On Behalf Of Vinod Gupta Tankala
Sent: Wednesday, September 28, 2011 12:12 PM
To: user@hbase.apache.org
Subject: Re: Recommended backup/restore solution for hbase

thanks Li. I didn't know about using S3 as a datastore. Will look into this
more.

I understand that HDFS replication will help in case of partial hardware
failure. I wanted to protect myself against inconsistencies, as I have been
bitten by them in the past. Those were caused by HBase fatal exceptions. One
of the reasons could have been standalone mode, which is not production
ready according to the HBase documentation.
Another use case I have: I will be writing sweeper jobs to delete user
data that is more than x months old. So in case we need to retrieve old
user data, I would like to have the ability to get old data back from
exported tables. Of course, I understand that to do so for selective user
accounts, I have to write custom jobs.

thanks
vinod

On Wed, Sep 28, 2011 at 11:49 AM, Li Pi <lpi@ucsd.edu> wrote:

> What kind of situations are you looking for to guard against? Partial
> hardware failure, full hardware failure (of live cluster),
> accidentally deleting all data?
>
> HDFS provides replication that already guards against partial hardware
> failure - if this is all you need, an ephemeral store should be fine.
>
> Also, HBase can use S3 directly as a datastore. You can choose the raw
> mode, in which HBase treats S3 as a disk. There used to be a block
> based mode as well, but now that S3 has increased the object size limit
> to 5 TB, this isn't needed anymore. (Somebody correct me if I'm wrong.)
>
> On Wed, Sep 28, 2011 at 9:15 AM, Vinod Gupta Tankala
> <tvinod@readypulse.com> wrote:
> > Hi,
> > Can someone answer these basic but important questions for me.
> > We are using HBase for our datastore and want to safeguard ourselves from
> > data corruption/data loss. Also, we are hosted on AWS EC2. Currently, I only
> > have a single node but want to prepare for scale right away, as things are
> > going to change starting in the next couple of weeks. Also, I am currently
> > using ephemeral store for HBase data.
> >
> > 1) What is the recommended AWS data store method for HBase? Should you use
> > ephemeral store and do S3 backups, or use EBS? I read and heard that EBS can
> > be expensive and also unreliable in terms of read/write latency. Of course,
> > it provides data replication and protection, so you don't have to worry
> > about that.
> >
> > 2) What is the recommended backup/restore method for HBase? I would like to
> > take periodic data snapshots and then have an import utility that will
> > incrementally import data in case I lose some regions due to corruption or
> > table inconsistencies. Also, if something catastrophic happens, I can
> > restore the whole data.
> >
> > 3) While we are at it, what are the recommended EC2 instance types for
> > running master/zookeeper/region servers? I get conflicting answers from
> > Google search - ranging from c1.xlarge to m1.xlarge.
> >
> > I would really appreciate if someone could help me.
> >
> > thanks
> > vinod
> >
>
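
For the periodic snapshots with incremental import asked about in question 2, one possible approach is to run the bundled MapReduce Export job on a schedule, each run covering only the time range since the previous one. The sketch below only builds the positional arguments for such a run. It assumes the argument order `<tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]` from the Export usage message and an exclusive end time; check both against your HBase version. The class name, table name, and output path are hypothetical.

```java
import java.util.List;

/** Sketch: build arguments for one incremental run of the HBase Export job. */
final class ExportWindowSketch {

    /**
     * Builds the positional arguments for a run covering [startMs, endMs).
     * Using the previous run's end time as the next start time leaves
     * neither a gap nor an overlap, provided the end time is exclusive.
     */
    static List<String> exportArgs(String table, String outputRoot, int versions,
                                   long startMs, long endMs) {
        if (endMs <= startMs) {
            throw new IllegalArgumentException("endMs must be greater than startMs");
        }
        // One output directory per window, so earlier exports are never overwritten.
        String outputDir = outputRoot + "/" + table + "-" + startMs + "-" + endMs;
        return List.of(table, outputDir, Integer.toString(versions),
                Long.toString(startMs), Long.toString(endMs));
    }

    public static void main(String[] args) {
        long lastExportEndMs = 1317168000000L; // hypothetical: end time of the previous run
        long nowMs = System.currentTimeMillis();
        // "users" and "/backup" are placeholder names for illustration.
        List<String> exportArgs = exportArgs("users", "/backup", 1, lastExportEndMs, nowMs);
        System.out.println("hbase org.apache.hadoop.hbase.mapreduce.Export "
                + String.join(" ", exportArgs));
    }
}
```

Each window's output can later be fed to the matching Import job, either all windows in order for a full restore or only the windows that cover the lost data.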