hbase-user mailing list archives

From Jameson Lopp <jame...@bronto.com>
Subject Re: setTimeRange for HBase Increment
Date Tue, 04 Oct 2011 18:14:26 GMT
Thanks, that makes sense. Unfortunately, it sounds like this feature is 
unable to solve my particular problem...
--
Jameson Lopp
Software Engineer
Bronto Software, Inc

On 10/04/2011 01:36 PM, Gary Helmling wrote:
> Jameson,
>
> The TimeRange you set on the Increment is used in looking up the previous
> value that you'll be incrementing.  It's not stored with the incremented
> value as a data "lifetime" or anything.  If a previously stored value is
> found within the given time range, it will be incremented.  If no value is
> found within that range, a new value is stored using the value from
> your Increment.
>
> As others have already covered, if you're looking for auto-cleanup of data
> you would set a TTL on the column family.
>
> So let me tweak your scenario a bit to explain how it might work:
>
> 0) Say you have a previous value on column "c1" of 2, last incremented 31
> days ago
>
> 1) You perform an increment on "c1" with a value of 1, minStamp = now - 30
> days, maxStamp = now
>
> 2) There is now a new version of "c1", with value=1, timestamp=now.  The
> previous version, with value=2, timestamp=now - 31 days, still exists and
> may be automatically cleaned up, subject to your settings for max versions
> and TTL.  So you would have:
>
> c1:
>    - v2: ts=now, value=1
>    - v1: ts=now-31days, value=2
>
> 3) Reading the current value of "c1" will return 1
>
> 4a) If you repeat step #1 31 days from now, you would wind up with a
> third version of "c1", again with value=1:
>
> c1:
>    - v3: ts=now, value=1
>    - v2: ts=now-31days, value=1
>    - v1: ts=now-62days, value=2
>
> 4b) If you instead repeat step #1 31 days from now, but using minStamp=now -
> 60 days, maxStamp=now, then you would be incrementing the existing "v2" of
> "c1", since it falls within the time range:
>
> c1:
>    - v2: ts=now, value=2
>    - v1: ts=now-62days, value=2
>
>
> I hope this clarifies things.
>
> --gh
>
>
> On Thu, Sep 29, 2011 at 12:40 PM, Jameson Lopp<jameson@bronto.com>  wrote:
>
>> Thanks! Nevertheless, can anyone confirm / deny if the scenario I described
>> would play out in that manner? Just want to make sure I understand the
>> functionality.
>>
>>
>> --
>> Jameson Lopp
>> Software Engineer
>> Bronto Software, Inc
>>
>> On 09/29/2011 03:32 PM, Doug Meil wrote:
>>
>>>
>>> Here are a few links on table cleanup and major compactions...
>>>
>>> http://hbase.apache.org/book.html#schema.minversions (ttl related)
>>>
>>> http://hbase.apache.org/book.html#perf.deleting.queue
>>>
>>> http://hbase.apache.org/book.html#compaction
>>>
>>>
>>>
>>>
>>>
>>> On 9/29/11 2:29 PM, "Ted Yu"<yuzhihong@gmail.com>   wrote:
>>>
>>>> Doug Meil may point you to related doc.
>>>>
>>>> Take a look at this as well:
>>>> https://issues.apache.org/jira/browse/HBASE-4241
>>>>
>>>> On Thu, Sep 29, 2011 at 11:22 AM, Jameson Lopp<jameson@bronto.com>
>>>>   wrote:
>>>>
>>>>> Hm, well I didn't mention a number of other requirements for the feature
>>>>> I'm building, but long story short, I need to keep track of millions to
>>>>> billions of these counters and need the lookup time to be as close to
>>>>> constant time as possible, thus I was really hoping to avoid doing table
>>>>> scans.
>>>>>
>>>>> I'll admit I know nothing of the dangers of auto-pruning; is there an
>>>>> article / documentation I could read about it? Google wasn't very
>>>>> helpful.
>>>>>
>>>>>
>>>>> --
>>>>> Jameson Lopp
>>>>> Software Engineer
>>>>> Bronto Software, Inc
>>>>>
>>>>>
>>>>> On 09/29/2011 02:12 PM, Jean-Daniel Cryans wrote:
>>>>>
>>>>>> My advice usually regarding timestamps is if it's part of your data
>>>>>> model, it should appear somewhere in an HBase key. 99% of the time
>>>>>> overloading the HBase timestamps is a bad idea, especially with
>>>>>> counters since there's auto-pruning done in the Memstore!
>>>>>>
>>>>>> I would suggest you make time part of your row key, maybe one counter
>>>>>> per day, and then set the TTL on your table to 30 days. Then all you
>>>>>> need to do is a sequential scan for those 30 days maybe with a prefix
>>>>>> that refers to some event id.
>>>>>>
>>>>>> OpenTSDB is another way of doing it: http://opentsdb.net/
>>>>>>
>>>>>> J-D
>>>>>>
>>>>>> On Thu, Sep 29, 2011 at 11:04 AM, Jameson Lopp<jameson@bronto.com>
>>>>>>   wrote:
>>>>>>
>>>>>>> I wish to store a count of 30-day trailing event data (e.g. # of clicks
>>>>>>> in past 30 days) and ended up reading the documentation for setTimeRange
>>>>>>> in the Increment operation.
>>>>>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Increment.html#getTimeRange%28%29
>>>>>>>
>>>>>>> I was hoping someone could clarify if it works as I'm imagining in
>>>>>>> this example scenario.
>>>>>>>
>>>>>>> 1) Current click count is 0
>>>>>>>
>>>>>>> 2) I process a click and I perform an increment operation with the
>>>>>>> time range set to minStamp = now and maxStamp = 30 days from now
>>>>>>>
>>>>>>> 3) I query for the value immediately and find it to be 1
>>>>>>>
>>>>>>> 4) Assuming no other clicks come in, if I query for the value in 31
>>>>>>> days, it will be returned as 0
>>>>>>>
>>>>>>> In essence, I'm looking for a way to set a TTL on my increment
>>>>>>> operation. Is this how it actually works? The documentation is a bit
>>>>>>> vague and I could imagine several other scenarios.
>>>>>>> --
>>>>>>> Jameson Lopp
>>>>>>> Software Engineer
>>>>>>> Bronto Software, Inc
>>>>>>>
>>>>>>>
>>>>>>>
>>>
>
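
Editorial sketch (not part of the original messages): the lookup behaviour Gary describes for an Increment's TimeRange can be modelled in a few lines of plain Python. This is a toy model only, not HBase code. The class ToyCounterCell, its methods, and the helper cell_after_steps_0_and_1 are hypothetical names, and the half-open range check (minStamp <= ts < maxStamp) is an assumption about how the range is evaluated. The model also assumes, following step 4b, that an incremented version is rewritten at the new timestamp.

```python
DAY_MS = 24 * 60 * 60 * 1000


class ToyCounterCell:
    """Toy model of the versions of one column; not the HBase implementation."""

    def __init__(self):
        self.versions = {}  # timestamp in ms -> stored counter value

    def increment(self, amount, min_stamp, max_stamp, now):
        # The time range only selects which previous value to build on.
        # Assumed half-open check: min_stamp <= ts < max_stamp.
        in_range = [ts for ts in self.versions if min_stamp <= ts < max_stamp]
        if in_range:
            # A previous value was found: it is incremented and rewritten at 'now'.
            new_value = self.versions.pop(max(in_range)) + amount
        else:
            # Nothing in range: a new version starts from the increment amount.
            new_value = amount
        self.versions[now] = new_value
        return new_value

    def read_latest(self):
        # A plain read returns the newest version, whatever range was used to write it.
        return self.versions[max(self.versions)] if self.versions else None


def cell_after_steps_0_and_1(now):
    """Rebuild the state from steps 0 and 1 of the quoted scenario."""
    cell = ToyCounterCell()
    cell.versions[now - 31 * DAY_MS] = 2            # step 0: old value of 2
    cell.increment(1, now - 30 * DAY_MS, now, now)  # step 1: 30-day range
    return cell
```

In this model the old version with value 2 is never expired by the time range itself. It simply falls outside the lookup window, which is why a TTL on the column family is still needed for cleanup.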

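Editorial sketch (not part of the original messages): J-D's suggestion of one counter per day in the row key, a 30-day TTL, and a short prefix scan can be modelled the same way. Again this is plain Python rather than HBase client code. The key layout "<event_id>:<YYYYMMDD>", the class ToyDailyCounters, and the expire method (standing in for what a column family TTL would do automatically) are all assumptions made for illustration.

```python
import bisect
from datetime import date, datetime, timedelta


def row_key(event_id, day):
    # Hypothetical key layout: keys for one event sort by day.
    return f"{event_id}:{day:%Y%m%d}"


class ToyDailyCounters:
    """Toy model of a table holding one counter row per event per day."""

    def __init__(self):
        self.rows = {}  # row key -> count

    def increment(self, event_id, day, amount=1):
        key = row_key(event_id, day)
        self.rows[key] = self.rows.get(key, 0) + amount

    def trailing_count(self, event_id, today, days=30):
        # Stand-in for a bounded scan: visit only the sorted keys between the
        # first and last day of the window for this event id.
        keys = sorted(self.rows)
        start = row_key(event_id, today - timedelta(days=days - 1))
        stop = row_key(event_id, today)
        lo = bisect.bisect_left(keys, start)
        hi = bisect.bisect_right(keys, stop)
        return sum(self.rows[k] for k in keys[lo:hi])

    def expire(self, today, ttl_days=30):
        # Stand-in for TTL cleanup: drop rows whose day is older than the TTL.
        cutoff = today - timedelta(days=ttl_days)
        for key in list(self.rows):
            day = datetime.strptime(key[-8:], "%Y%m%d").date()
            if day < cutoff:
                del self.rows[key]
```

The trailing count touches at most 30 rows per event regardless of how many events exist, which is the sense in which the scan stays small. Whether that is close enough to constant time for billions of counters is the trade-off raised earlier in the thread.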