hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: hbase doesn't delete data older than TTL in old regions
Date Thu, 16 Sep 2010 00:29:23 GMT
Yeah, indeed the TTL feature is not broken. It works as "advertised" if you understand how
HBase internals work. 

But we can accommodate the expectations communicated on this thread; the request sounds reasonable.

    - Andy


--- On Wed, 9/15/10, Ryan Rawson <ryanobjc@gmail.com> wrote:

> From: Ryan Rawson <ryanobjc@gmail.com>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
> To: user@hbase.apache.org
> Date: Wednesday, September 15, 2010, 11:43 AM
> I feel the need to pipe in here, since people are accusing hbase of
> having a broken feature 'TTL', when neither the description in this
> email thread nor my own knowledge really describes a broken feature.
> Non-optimal maybe, but not broken.
> 
> First off, the TTL feature works on the timestamp, thus rowkey
> structure is not related.  This is because the timestamp is stored in
> a different field.  If you are also storing the data in row key
> chronological order, then you may end up with sparse or 'small'
> regions.  But that doesn't mean the feature is broken, i.e. that it
> does not remove data older than the TTL.  Needs tuning yes, but not
> broken.
> 
> Also note that "client side deletes" work in the same way that TTL
> does: you insert a tombstone marker, then a compaction actually purges
> the data itself.
> 
> -ryan
> 
> On Wed, Sep 15, 2010 at 11:26 AM, Jinsong Hu <jinsong_hu@hotmail.com>
> wrote:
> > I opened a ticket, https://issues.apache.org/jira/browse/HBASE-2999, to
> > track the issue. Dropping the old store, and updating the adjacent
> > region's key range when all stores for a region are gone, is probably
> > the cheapest solution, both in terms of coding and in terms of resource
> > usage in the cluster. Do we know when this can be done?
> >
> >
> > Jimmy.
> >
> > --------------------------------------------------
> > From: "Jonathan Gray" <jgray@facebook.com>
> > Sent: Wednesday, September 15, 2010 11:06 AM
> > To: <user@hbase.apache.org>
> > Subject: RE: hbase doesn't delete data older than TTL in old regions
> >
> >> This sounds reasonable.
> >>
> >> We are tracking min/max timestamps in storefiles too, so it's possible
> >> that we could expire some files of a region as well, even if the region
> >> was not completely expired.
> >>
> >> Jinsong, mind filing a jira?
> >>
> >> JG
> >>
> >>> -----Original Message-----
> >>> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
> >>> Sent: Wednesday, September 15, 2010 10:39 AM
> >>> To: user@hbase.apache.org
> >>> Subject: Re: hbase doesn't delete data older than TTL in old regions
> >>>
> >>> Yes, the current compaction-based TTL works as advertised if the keys
> >>> distribute the incoming data randomly among all regions.  However, if
> >>> the key is designed in chronological order, the TTL doesn't really
> >>> work, as no compaction will happen for data already written. So we
> >>> can't say that the current TTL really works as advertised, as it is
> >>> key structure dependent.
> >>>
> >>> This is a pity, because a major use case for hbase is for people to
> >>> store history or log data. Normally people only want to retain the
> >>> data for a fixed period. For example, the US government default data
> >>> retention policy is 7 years. Such data is saved in chronological
> >>> order. The current TTL implementation doesn't work at all for that
> >>> kind of use case.
> >>>
> >>> In order for that use case to really work, hbase needs to have an
> >>> active thread that periodically runs and checks whether there is data
> >>> older than TTL, deletes that data if necessary, and compacts small
> >>> regions older than a certain time period into larger ones to save
> >>> system resources. It can optimize the deletion by deleting the whole
> >>> region if it detects that the last timestamp for the region is older
> >>> than TTL.  There should be the following parameters to configure for
> >>> hbase:
> >>>
> >>> 1. Whether to disable/enable the TTL thread.
> >>> 2. The interval at which the TTL thread will run. Maybe we can use a
> >>> special value like 0 to indicate that we don't run the TTL thread,
> >>> thus saving one configuration parameter. For the default TTL, it
> >>> should probably be set to 1 day.
> >>> 3. How small regions must be before they are merged. It should be a
> >>> percentage of the store size. For example, if 2 consecutive regions
> >>> are only 10% of the store size (default is 256M), we can initiate a
> >>> region merge.  We probably need a parameter to reduce the merging too.
> >>> For example, we only merge regions whose largest timestamp is older
> >>> than half of TTL.
> >>>
> >>>
> >>> Jimmy
> >>>
> >>>
> >>> --------------------------------------------------
> >>> From: "Stack" <stack@duboce.net>
> >>> Sent: Wednesday, September 15, 2010 10:08 AM
> >>> To: <user@hbase.apache.org>
> >>> Subject: Re: hbase doesn't delete data older than TTL in old regions
> >>>
> >>> > On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <jinsong_hu@hotmail.com>
> >>> > wrote:
> >>> >> I have tested the TTL for hbase and found that it relies on
> >>> >> compaction to remove old data. However, if a region has data that is
> >>> >> older than TTL, and there is no trigger to compact it, then the data
> >>> >> will remain there forever, wasting disk space and memory.
> >>> >>
> >>> >
> >>> > So it's working as advertised then?
> >>> >
> >>> > There's currently an issue where we can skip major compactions if
> >>> > your write loading has a particular character: hbase-2990.
> >>> >
> >>> >
> >>> >> It appears at this stage, to really remove data older than TTL we
> >>> >> need to start a client side deletion request.
> >>> >
> >>> > Or run a manual major compaction:
> >>> >
> >>> > $ echo "major_compact TABLENAME" | ./bin/hbase shell
> >>> >
> >>> >
> >>> >
> >>> >> This is really a pity because it is a more expensive way to get
> >>> >> the job done.  Another side effect of this is that as time goes on,
> >>> >> we will end up with some small regions if the data are saved in
> >>> >> chronological order in regions. It appears that hbase doesn't have
> >>> >> a mechanism to merge 2 consecutive small regions into a bigger one
> >>> >> at this time.
> >>> >
> >>> > $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
> >>> > Usage: bin/hbase merge <table-name> <region-1> <region-2>
> >>> >
> >>> > Currently only works on an offlined table, but there's a patch
> >>> > available to make it run against onlined regions.
> >>> >
> >>> >
> >>> >> So if data is saved in chronological order, sooner or later we will
> >>> >> run out of capacity, even if the amount of data in hbase is small,
> >>> >> because we have lots of regions with small storage space.
> >>> >>
> >>> >> A much cheaper way to remove data older than TTL would be to
> >>> >> remember the latest timestamp for the region in the .META. table
> >>> >> and, if the time is older than TTL, just adjust the row in .META.
> >>> >> and delete the store, without doing any compaction.
> >>> >>
> >>> >
> >>> > Say more on the above.  It sounds promising.  Are you suggesting that
> >>> > in addition to compactions we also have a provision where we keep
> >>> > account of a storefile's latest timestamp (we already do this I
> >>> > believe) and that when now - storefile-timestamp > ttl, we just
> >>> > remove the storefile wholesale?  That sounds like it could work, if
> >>> > that is what you are suggesting.  Mind filing an issue w/ a detailed
> >>> > description?
> >>> >
> >>> > Thanks,
> >>> > St.Ack
> >>> >
> >>> >
> >>> >
> >>> >> Can this be added to the hbase requirements for a future release?
> >>> >>
> >>> >> Jimmy
> >>> >>
> >>> >>
> >>> >>
> >>> >
> >>
> >
> 
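
For illustration, the whole-file expiry check discussed in the quoted thread (drop a storefile once "now - storefile-timestamp > ttl") might look roughly like the sketch below. This is not HBase code: the StoreFile class and the fully_expired function are hypothetical names, and timestamps are assumed to be plain integers in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreFile:
    """Hypothetical stand-in for the min/max timestamps tracked per storefile."""
    name: str
    min_ts: int  # oldest cell timestamp in the file, in seconds
    max_ts: int  # newest cell timestamp in the file, in seconds


def fully_expired(files, ttl_seconds, now):
    """Return the files whose newest cell is already older than the TTL.

    If even the newest cell has expired, every cell in the file has, so
    the file could be removed wholesale instead of being rewritten by a
    compaction.
    """
    return [f for f in files if now - f.max_ts > ttl_seconds]


files = [
    StoreFile("old", min_ts=0, max_ts=100),
    StoreFile("mixed", min_ts=50, max_ts=900),
    StoreFile("new", min_ts=950, max_ts=990),
]
expired = fully_expired(files, ttl_seconds=500, now=1000)
print([f.name for f in expired])
```

Only the file whose newest timestamp is past the TTL is selected; a file holding a mix of expired and live cells still needs a compaction to drop the expired ones.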

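The region-merge rule proposed in the quoted thread (merge 2 consecutive regions that are only 10% of the store size, and only when their largest timestamp is older than half of TTL) could be sketched along these lines. The Region class and merge_candidates function are hypothetical, and reading "10% of the store size" as the combined size of the pair is an assumption, since the thread does not say whether the limit applies per region or per pair.

```python
from dataclasses import dataclass

MAX_STORE_BYTES = 256 * 1024 * 1024  # the 256M default quoted in the thread


@dataclass(frozen=True)
class Region:
    """Hypothetical per-region summary; not an HBase type."""
    name: str
    size_bytes: int
    max_ts: int  # newest cell timestamp in the region, in seconds


def merge_candidates(regions, ttl_seconds, now, fraction=0.10):
    """Yield names of adjacent region pairs that look worth merging.

    `regions` must be sorted by start key so that neighbours in the list
    are neighbours in the key space. A pair qualifies when its combined
    size is at most `fraction` of the maximum store size (an assumed
    reading of the proposal) and the newest data in both regions is
    older than half of the TTL.
    """
    limit = fraction * MAX_STORE_BYTES
    for left, right in zip(regions, regions[1:]):
        small = left.size_bytes + right.size_bytes <= limit
        cold = all(now - r.max_ts > ttl_seconds / 2 for r in (left, right))
        if small and cold:
            yield left.name, right.name


MB = 1024 * 1024
regions = [
    Region("a", 5 * MB, max_ts=0),
    Region("b", 10 * MB, max_ts=100),
    Region("c", 200 * MB, max_ts=900),
    Region("d", 1 * MB, max_ts=950),
]
print(list(merge_candidates(regions, ttl_seconds=1000, now=1000)))
```

The sketch only reports candidate pairs. The same region can appear in two overlapping pairs, so anything acting on the result would have to make sure each region is merged at most once per pass.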

      

