hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: setTimeRange for HBase Increment
Date Thu, 29 Sep 2011 18:29:26 GMT
Doug Meil may be able to point you to the related documentation.

Take a look at this as well:
https://issues.apache.org/jira/browse/HBASE-4241

On Thu, Sep 29, 2011 at 11:22 AM, Jameson Lopp <jameson@bronto.com> wrote:

> Hm, well I didn't mention a number of other requirements for the feature
> I'm building, but long story short, I need to keep track of millions to
> billions of these counters and need the lookup time to be as close to
> constant time as possible, thus I was really hoping to avoid doing table
> scans.
>
> I'll admit I know nothing of the dangers of auto-pruning; is there an
> article / documentation I could read about it? Google wasn't very helpful.
>
>
> --
> Jameson Lopp
> Software Engineer
> Bronto Software, Inc
>
>
> On 09/29/2011 02:12 PM, Jean-Daniel Cryans wrote:
>
>> My advice usually regarding timestamps is if it's part of your data
>> model, it should appear somewhere in an HBase key. 99% of the time
>> overloading the HBase timestamps is a bad idea, especially with
>> counters since there's auto-pruning done in the Memstore!
>>
>> I would suggest you make time part of your row key, maybe one counter
>> per day, and then set the TTL on your table to 30 days. Then all you
>> need to do is a sequential scan for those 30 days maybe with a prefix
>> that refers to some event id.
>>
>> OpenTSDB is another way of doing it: http://opentsdb.net/
>>
>> J-D
>>
>> On Thu, Sep 29, 2011 at 11:04 AM, Jameson Lopp <jameson@bronto.com> wrote:
>>
>>> I wish to store a count of 30-day trailing event data (e.g. # of clicks
>>> in past 30 days) and ended up reading the documentation for setTimeRange
>>> in the Increment operation.
>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Increment.html#getTimeRange%28%29
>>>
>>> I was hoping someone could clarify if it works as I'm imagining in this
>>> example scenario.
>>>
>>> 1) Current click count is 0
>>>
>>> 2) I process a click and I perform an increment operation with the time
>>> range set to minStamp = now and maxStamp = 30 days from now
>>>
>>> 3) I query for the value immediately and find it to be 1
>>>
>>> 4) Assuming no other clicks come in, if I query for the value in 31 days,
>>> it will be returned as 0
>>>
>>> In essence, I'm looking for a way to set a TTL on my increment operation.
>>> Is this how it actually works? The documentation is a bit vague and I
>>> could imagine several other scenarios.
>>> --
>>> Jameson Lopp
>>> Software Engineer
>>> Bronto Software, Inc
>>>
>>>
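
For readers of the archive, J-D's suggestion above (time in the row key, one counter per day, then a bounded read over 30 days) can be sketched roughly as follows. This is only an illustration under stated assumptions, not HBase client code: the class name DailyCounters, the "<eventId>:<yyyyMMdd>" key layout, and the method names are all hypothetical, and a java.util.TreeMap stands in for the sorted table. In a real table the column family TTL would expire old rows; here nothing expires, and the key range alone bounds the window.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.TreeMap;

public class DailyCounters {
    // Assumed key layout: "<eventId>:<yyyyMMdd>". The fixed-width date makes
    // lexicographic key order match chronological order.
    private static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE;

    // Stand-in for a table sorted by row key (not a real HBase table).
    private final TreeMap<String, Long> table = new TreeMap<>();

    public static String rowKey(String eventId, LocalDate day) {
        return eventId + ":" + day.format(DAY);
    }

    // One counter per event per day: add to that day's row.
    public void increment(String eventId, LocalDate day, long amount) {
        table.merge(rowKey(eventId, day), amount, Long::sum);
    }

    // Sum the counters for the trailing window ending today (inclusive).
    // This reads at most `days` rows for the event, regardless of table size.
    public long trailingCount(String eventId, LocalDate today, int days) {
        String start = rowKey(eventId, today.minusDays(days - 1));
        String stop = rowKey(eventId, today);
        long total = 0;
        for (long value : table.subMap(start, true, stop, true).values()) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        DailyCounters counters = new DailyCounters();
        LocalDate today = LocalDate.of(2011, 9, 29);
        counters.increment("click", today, 1);
        counters.increment("click", today.minusDays(29), 2);
        counters.increment("click", today.minusDays(30), 5); // outside the window
        System.out.println(counters.trailingCount("click", today, 30)); // prints 3
    }
}
```

Because each lookup touches at most 30 consecutive keys for one event, the read cost stays bounded even with billions of counters, which may address the concern about avoiding full table scans.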
