hbase-user mailing list archives

From Jonathan Gray <jg...@fb.com>
Subject RE: Modifying existing table entries
Date Tue, 14 Dec 2010 08:57:01 GMT
Hey Adam,

Do you need to scan all of the entries in order to know which ones you need to change the
expiration of?  Or do you have that information as an input?

As for why you can't insert an older version: HBase sorts the versions of each column in descending
timestamp order regardless of insertion order, and a read returns the newest one.  To make the latest
timestamp of a column older than an existing version, you would need to explicitly delete the existing version:

   Delete.deleteColumn(byte [] family, byte [] qualifier, long timestamp)
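
To illustrate the ordering behavior, here is a minimal sketch in plain Java.  It is a toy model,
not HBase code: the class VersionOrderSketch and its methods are hypothetical, and it ignores
details such as delete markers and compactions.  It only shows why a write with an older
timestamp does not become the "latest" version until the newer one is removed.

```java
import java.util.Comparator;
import java.util.TreeMap;

public class VersionOrderSketch {
    // Toy stand-in for the versions of a single column: timestamp -> value,
    // kept in descending timestamp order regardless of insertion order.
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Comparator.reverseOrder());

    // Store a value under an explicit timestamp.
    public void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Rough analogue of deleting one specific version of the column.
    public void deleteVersion(long timestamp) {
        versions.remove(timestamp);
    }

    // A default read sees the version with the highest timestamp, or null if none.
    public String latest() {
        return versions.isEmpty() ? null : versions.firstEntry().getValue();
    }

    public static void main(String[] args) {
        VersionOrderSketch column = new VersionOrderSketch();
        column.put(2000L, "original");
        column.put(1000L, "rewritten with older timestamp");
        // The newer timestamp still wins, even though it was written first.
        System.out.println(column.latest());
        // Only after removing the newer version does the older one surface.
        column.deleteVersion(2000L);
        System.out.println(column.latest());
    }
}
```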

An alternative approach would be to allow storing multiple versions of your columns.  At read
time, you would get all versions and resolve which one to use based on a piece of metadata
stored alongside the value (for example, the real write time, so you know which is latest).
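
A rough sketch of that read-time resolution, again in plain Java rather than against the HBase
client API.  The StoredVersion class and its field names are assumptions made up for
illustration; the idea is only that the cell timestamp drives expiration while a separately
stored write time decides which version is current.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class VersionResolverSketch {
    // Hypothetical shape of one version as the application would read it back.
    public static final class StoredVersion {
        public final long cellTimestamp;    // timestamp HBase sorts and expires on
        public final long writtenAtMillis;  // real write time, kept as metadata
        public final String value;

        public StoredVersion(long cellTimestamp, long writtenAtMillis, String value) {
            this.cellTimestamp = cellTimestamp;
            this.writtenAtMillis = writtenAtMillis;
            this.value = value;
        }
    }

    // Pick the most recently written version, ignoring the cell timestamp.
    public static Optional<StoredVersion> resolve(List<StoredVersion> versions) {
        return versions.stream()
                .max(Comparator.comparingLong((StoredVersion v) -> v.writtenAtMillis));
    }

    public static void main(String[] args) {
        List<StoredVersion> versions = Arrays.asList(
                new StoredVersion(5000L, 100L, "first write, long expiry"),
                new StoredVersion(1000L, 200L, "later write, short expiry"));
        // The later write is chosen even though its cell timestamp is older.
        resolve(versions).ifPresent(v -> System.out.println(v.value));
    }
}
```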

If you're going to need fine-grained control and flexibility on TTL policies, you might just
set the HBase TTL to the maximum possible value and rely on application logic / metadata stored in HBase.
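
As a sketch of what that application logic could look like, assuming each entry carries its own
expiration time as metadata.  The names here (StoredEntry, expiresAtMillis, liveEntries) are
hypothetical and this is plain Java, not HBase client code.  Extending an entry's life or
expiring it immediately then becomes an ordinary rewrite of the metadata.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AppLevelExpirySketch {
    // Hypothetical entry: the value plus an application-managed expiration time.
    public static final class StoredEntry {
        public final String value;
        public final long expiresAtMillis;

        public StoredEntry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }

        // Changing the expiration is just writing a new copy of the entry.
        public StoredEntry withExpiry(long newExpiresAtMillis) {
            return new StoredEntry(value, newExpiresAtMillis);
        }
    }

    // An entry is live while the current time is before its expiration time.
    public static boolean isLive(StoredEntry entry, long nowMillis) {
        return nowMillis < entry.expiresAtMillis;
    }

    // Filter expired entries out at read time instead of relying on the table TTL.
    public static Map<String, StoredEntry> liveEntries(Map<String, StoredEntry> rows,
                                                       long nowMillis) {
        Map<String, StoredEntry> live = new LinkedHashMap<>();
        for (Map.Entry<String, StoredEntry> row : rows.entrySet()) {
            if (isLive(row.getValue(), nowMillis)) {
                live.put(row.getKey(), row.getValue());
            }
        }
        return live;
    }

    public static void main(String[] args) {
        long now = 1_000L;
        Map<String, StoredEntry> rows = new LinkedHashMap<>();
        rows.put("row-a", new StoredEntry("a", 2_000L));
        rows.put("row-b", new StoredEntry("b", 2_000L).withExpiry(now)); // expire immediately
        System.out.println(liveEntries(rows, now).keySet());
    }
}
```

The trade-off is that expired entries stay on disk until something removes them, so a
periodic cleanup job would probably still be needed.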

I'm not sure exactly what the load patterns or requirements are for your application, so I'm not
sure what the best approach might be.  I commend you for a creative use of TTLs and versioning
:)

JG


> -----Original Message-----
> From: Adam Phelps [mailto:amp@opendns.com]
> Sent: Monday, December 13, 2010 4:03 PM
> To: user@hbase.apache.org
> Subject: Re: Modifying existing table entries
> 
> On 12/13/10 11:11 AM, Adam Phelps wrote:
> > Does anyone have suggestions regarding the best way to modify existing
> > entries in a table?
> >
> > We have our tables set up such that when we create an entry we set its
> > timestamp to give the entry a rough expiration time, i.e. we have
> > a TTL on the table as a whole and then adjust the timestamp so that
> > HBase will clean up the entry approximately when we wish.
> >
> > However there are some rare situations where we would like to change
> > that expiration time on a subset of the entries (typically either to
> > have them expire immediately or to extend their life).
> >
> > My current thought is to use TableMapReduceUtil to run a MR job
> > against a table, filter out just the keys I need to change, create
> > copies of the KeyValues for that key with a new timestamp, and write
> > them back out using the existing keys. Would something along these lines
> > work?
> >
> > Alternately is there some better way to do this that I haven't seen yet?
> 
> I tried implementing this, and it seems to only mostly work.  This method
> appears to allow a timestamp to be increased but not decreased.
> 
> I'm guessing that HBase is trying to prevent "older" entries from accidentally
> overwriting "newer" ones.  Does anyone know if there's a way to overcome
> this?  Or suggestions on other ways to have entry-specific TTLs?
> 
> - Adam
