hbase-user mailing list archives

From Anoop John <anoop.hb...@gmail.com>
Subject Re: Custom Retention of data based on Rowkey
Date Thu, 09 Mar 2017 12:49:56 GMT
From version 0.98.9 onward, a per-cell TTL feature is available
(see HBASE-10560), so TTL can be controlled even at the cell level.

It might eat up more space, as the TTL is stored with every cell.  What
you want is one TTL for one group of rows and a different TTL for another.
The major concern is that you need a "dynamically updatable" TTL.  Per-cell
TTL cannot do this, because the TTL is fixed when the cell is written.

So you might need a custom compaction policy plugged in.  You would probably
need to decorate the HBase compaction using coprocessors (CPs).  Also
remember that you will need some CP work on the read path (Scan/Get) to make
sure TTL-expired data is not returned: compaction might not have happened
yet even though the data is past its TTL, and at the HBase level you have
not configured any TTL, so HBase itself will not filter it.
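
A minimal sketch of the expiry check described above, in plain Java with no
HBase dependency, so that the same decision could be shared by a compaction
hook and a read-path hook. The class name TenantRetention, the
"tenantId|rest" row key layout, and the in-memory policy map are all
assumptions made for illustration, not HBase APIs; wiring this into a
coprocessor is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: decides whether a cell is past its tenant's retention. */
final class TenantRetention {
    // Assumed row key layout: "<tenantId>|<rest of key>". Adjust to the real encoding.
    private static final byte SEPARATOR = '|';

    // Tenant id -> retention in milliseconds. Updatable at runtime, unlike a per-cell TTL.
    private final Map<String, Long> retentionMs = new ConcurrentHashMap<>();

    /** Sets or replaces the retention period for one tenant. */
    void setRetention(String tenantId, long millis) {
        retentionMs.put(tenantId, millis);
    }

    /** Extracts the tenant id prefix; a key without a separator is treated as all tenant id. */
    static String tenantOf(byte[] rowKey) {
        int end = 0;
        while (end < rowKey.length && rowKey[end] != SEPARATOR) {
            end++;
        }
        return new String(rowKey, 0, end, StandardCharsets.UTF_8);
    }

    /** True if the cell should be dropped. Tenants with no policy are always kept. */
    boolean isExpired(byte[] rowKey, long cellTimestamp, long now) {
        Long ttl = retentionMs.get(tenantOf(rowKey));
        return ttl != null && now - cellTimestamp > ttl;
    }
}
```

The compaction side would skip cells for which isExpired returns true, and
the Scan/Get side would filter the same cells until compaction has actually
removed them. Because the policy is looked up at decision time, changing a
tenant's retention affects cells that were written earlier.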

-Anoop-

On Thu, Mar 9, 2017 at 4:34 PM, Gaurav Agarwal <gauravagarwal4@gmail.com> wrote:
> Hi,
>
> Looking deeper, I found that RegionObserver interface provides general
> hooks to intercept the pre-compaction scanner. That should suffice for
> our purpose!
>
> In any case, if there are any suggestions/guidelines, it will be much appreciated.
>
>
>
> From: Gaurav Agarwal <gauravagarwal4@gmail.com>
> Date: Thursday, 9 March 2017 at 2:08 PM
> To: <user@hbase.apache.org>
> Cc: Kshitij Gupta <kshitijg@vmware.com>, Mukul Gupta <mukulg@vmware.com>
> Subject: Custom Retention of data based on Rowkey
>
>
>
> Hi All,
>
>
>
> We have an application that stores information on multiple
> users/customers/tenants in a common table. Each tenant has a unique id
> which we encode in the row key of the records that are stored in the table.
>
>
>
> We want to apply custom (and dynamically updatable) data retention
> policies for each tenant.  What would be a reasonable way to achieve that?
>
>
>
> Searching through forums, I came across this link, which suggests either
> writing an external process to retrieve and delete cells based on the
> retention policy or writing a custom compaction policy:
>
> https://community.hortonworks.com/questions/14883/best-way-to-achieve-custom-retention-of-some-rows.html
>
>
>
> We felt that writing an external scanner for managing retention would be
> simpler but very inefficient, as it would require getting the entire data
> set out of the HBase server and then issuing delete calls back to it.
>
>
>
> Does anyone know if there has been any recent progress on this aspect of
> data retention in HBase?
>
>
>
> Additionally, if I go the route of writing my own custom compaction
> policy, what would be the best place to start? Maybe I could copy/extend
> the “default” HBase compaction policy and enhance it to look at the rowkey
> inside every Cell to decide whether the cell needs to be deleted?
>
>
>
> --
>
> cheers,
>
> gaurav
>
>
>
>
>
