hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: hbase doesn't delete data older than TTL in old regions
Date Wed, 15 Sep 2010 18:43:27 GMT
I feel the need to pipe in here, since people are accusing hbase of
having a broken 'TTL' feature, when neither the description in this
email thread nor my own knowledge really describes a broken feature.
Non-optimal maybe, but not broken.

First off, the TTL feature works on the timestamp, thus rowkey
structure is not related.  This is because the timestamp is stored in
a different field.  If you are also storing the data in row key
chronological order, then you may end up with sparse or 'small'
regions.  But that doesn't mean the feature is broken, ie: that it
does not remove data older than the TTL.  Needs tuning yes, but not
broken.

Also note that "client side deletes" work in the same way that TTL
does: you insert a tombstone marker, then a compaction actually purges
the data itself.
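
To make the point concrete, here is a minimal sketch in plain Python. It
is not HBase code: the Cell class and the is_expired and compact helpers
are hypothetical names invented for illustration, and "compaction" is
modeled as nothing more than rewriting a list. It only shows that expiry
is decided from the timestamp field, never from the row key, and that
expired cells stay on disk until something rewrites the store.

```python
from dataclasses import dataclass

DAY_MS = 24 * 60 * 60 * 1000


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a stored cell (not an HBase class)."""
    row: str           # row key; plays no part in TTL expiry
    timestamp_ms: int  # timestamp, stored in its own field
    value: str


def is_expired(cell, ttl_ms, now_ms):
    # Expiry depends only on the cell's age, not on its row key.
    return now_ms - cell.timestamp_ms > ttl_ms


def compact(cells, ttl_ms, now_ms):
    # Model of a compaction: rewrite the store, dropping expired cells.
    return [c for c in cells if not is_expired(c, ttl_ms, now_ms)]


now_ms = 10 * DAY_MS
store = [
    Cell("20100901-a", 1 * DAY_MS, "old, chronological key"),
    Cell("zzz-random-key", 2 * DAY_MS, "old, random key"),
    Cell("20100910-b", 9 * DAY_MS, "fresh"),
]

# Until a compaction runs, all three cells are still physically stored.
survivors = compact(store, ttl_ms=7 * DAY_MS, now_ms=now_ms)
print([c.row for c in survivors])
```

With a 7 day TTL, only the cell written on day 9 survives the rewrite;
the other two are dropped regardless of what their row keys look like.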

-ryan

On Wed, Sep 15, 2010 at 11:26 AM, Jinsong Hu <jinsong_hu@hotmail.com> wrote:
> I opened a ticket https://issues.apache.org/jira/browse/HBASE-2999 to track
> this issue. Dropping the old store, and updating the adjacent region's key
> range when all stores for a region are gone, is probably the cheapest
> solution, both in terms of coding and in terms of resource usage in the
> cluster. Do we know when this can be done?
>
>
> Jimmy.
>
> --------------------------------------------------
> From: "Jonathan Gray" <jgray@facebook.com>
> Sent: Wednesday, September 15, 2010 11:06 AM
> To: <user@hbase.apache.org>
> Subject: RE: hbase doesn't delete data older than TTL in old regions
>
>> This sounds reasonable.
>>
>> We are tracking min/max timestamps in storefiles too, so it's possible
>> that we could expire some files of a region as well, even if the region was
>> not completely expired.
>>
>> Jinsong, mind filing a jira?
>>
>> JG
>>
>>> -----Original Message-----
>>> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
>>> Sent: Wednesday, September 15, 2010 10:39 AM
>>> To: user@hbase.apache.org
>>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>>
>>> Yes, the current TTL based on compaction works as advertised if the key
>>> randomly distributes the incoming data among all regions.  However, if
>>> the key is designed in chronological order, the TTL doesn't really work,
>>> as no compaction will happen for data already written. So we can't say
>>> that the current TTL really works as advertised, as it is key structure
>>> dependent.
>>>
>>> This is a pity, because a major use case for hbase is for people to
>>> store history or log data. Normally people only want to retain the data
>>> for a fixed period. For example, the US government default data retention
>>> policy is 7 years. Those data are saved in chronological order. The
>>> current TTL implementation doesn't work at all for that kind of use case.
>>>
>>> In order for that use case to really work, hbase needs to have an
>>> active thread that periodically runs and checks whether there are data
>>> older than TTL, deletes the data older than TTL if necessary, and
>>> compacts small regions older than a certain time period into larger ones
>>> to save system resources. It can optimize the deletion by deleting the
>>> whole region if it detects that the last timestamp for the region is
>>> older than TTL.  There should be 2 parameters to configure for hbase:
>>>
>>> 1. whether to disable/enable the TTL thread.
>>> 2. the interval at which the TTL thread will run. Maybe we can use a
>>> special value like 0 to indicate that we don't run the TTL thread, thus
>>> saving one configuration parameter.
>>> For the default TTL, probably it should be set to 1 day.
>>> 3. How small regions must be to be merged. It should be a percentage of
>>> the store size. For example, if 2 consecutive regions are only 10% of
>>> the store size (default is 256M), we can initiate a region merge.  We
>>> probably need a parameter to reduce the merging too. For example, we only
>>> merge regions whose largest timestamp is older than half of TTL.
>>>
>>>
>>> Jimmy
>>>
>>> --------------------------------------------------
>>> From: "Stack" <stack@duboce.net>
>>> Sent: Wednesday, September 15, 2010 10:08 AM
>>> To: <user@hbase.apache.org>
>>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>>
>>> > On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <jinsong_hu@hotmail.com>
>>> > wrote:
>>> >> I have tested the TTL for hbase and found that it relies on compaction
>>> >> to remove old data. However, if a region has data that is older
>>> >> than TTL, and there is no trigger to compact it, then the data will
>>> >> remain there forever, wasting disk space and memory.
>>> >>
>>> >
>>> > So it's working as advertised then?
>>> >
>>> > There's currently an issue where we can skip major compactions if your
>>> > write loading has a particular character: hbase-2990.
>>> >
>>> >
>>> >> It appears at this state, to really remove data older than TTL we need
>>> >> to start a client side deletion request.
>>> >
>>> > Or run a manual major compaction:
>>> >
>>> > $ echo "major_compact TABLENAME" | ./bin/hbase shell
>>> >
>>> >
>>> >
>>> >> This is really a pity because
>>> >> it is a more expensive way to get the job done.  Another side effect of
>>> >> this is that as time goes on, we will end up with some small
>>> >> regions if the data are saved in chronological order in regions. It
>>> >> appears that hbase doesn't have a mechanism to merge 2 consecutive
>>> >> small regions into a bigger one at this time.
>>> >
>>> > $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
>>> > Usage: bin/hbase merge <table-name> <region-1> <region-2>
>>> >
>>> > Currently only works on offlined table but there's a patch available
>>> > to make it run against onlined regions.
>>> >
>>> >
>>> >> So if data is saved in
>>> >> chronological order, sooner or later we will run out of capacity, even
>>> >> if the amount of data in hbase is small, because we have lots of regions
>>> >> with small storage space.
>>> >>
>>> >> A much cheaper way to remove data older than TTL would be to remember
>>> >> the latest timestamp for the region in the .META. table
>>> >> and if the time is older than TTL, we just adjust the row in .META. and
>>> >> delete the store, without doing any compaction.
>>> >>
>>> >
>>> > Say more on the above.  It sounds promising.  Are you suggesting that
>>> > in addition to compactions we also have a provision where we keep
>>> > account of a storefile's latest timestamp (we already do this I
>>> > believe) and that when now - storefile-timestamp > ttl, we just remove
>>> > the storefile wholesale?  That sounds like it could work, if that is
>>> > what you are suggesting.  Mind filing an issue w/ a detailed
>>> > description?
>>> >
>>> > Thanks,
>>> > St.Ack
>>> >
>>> >
>>> >
>>> >> Can this be added to the hbase requirement for future release ?
>>> >>
>>> >> Jimmy
>>> >>
>>> >>
>>> >>
>>> >
>>
>
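
The storefile-level idea quoted above (remove a file wholesale when
now - storefile-timestamp > ttl) can be sketched in a few lines of
plain Python. This is only an illustration under stated assumptions,
not HBase code and not the design tracked in HBASE-2999: the StoreFile
class, its min_ts_ms/max_ts_ms fields, and the split_expired and
region_fully_expired helpers are all hypothetical names.

```python
from dataclasses import dataclass

DAY_MS = 24 * 60 * 60 * 1000


@dataclass(frozen=True)
class StoreFile:
    """Hypothetical storefile metadata (not an HBase class)."""
    name: str
    min_ts_ms: int  # oldest cell timestamp in the file
    max_ts_ms: int  # newest cell timestamp in the file


def split_expired(files, ttl_ms, now_ms):
    # A file can be dropped wholesale only if its NEWEST cell is past TTL.
    expired = [f for f in files if now_ms - f.max_ts_ms > ttl_ms]
    live = [f for f in files if now_ms - f.max_ts_ms <= ttl_ms]
    return expired, live


def region_fully_expired(files, ttl_ms, now_ms):
    # True when the region has files and every one of them is expired.
    expired, live = split_expired(files, ttl_ms, now_ms)
    return bool(files) and not live


now_ms = 30 * DAY_MS
ttl_ms = 7 * DAY_MS
files = [
    StoreFile("f1", 1 * DAY_MS, 5 * DAY_MS),    # newest cell is 25 days old
    StoreFile("f2", 20 * DAY_MS, 22 * DAY_MS),  # newest cell is 8 days old
    StoreFile("f3", 21 * DAY_MS, 29 * DAY_MS),  # newest cell is 1 day old
]

expired, live = split_expired(files, ttl_ms, now_ms)
print([f.name for f in expired], [f.name for f in live])
```

Note that f3 is kept even though its oldest cells are already past the
TTL; in this sketch those cells would still need a compaction to be
purged, since only whole files are dropped.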
