hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3162) Add TimeRange support into Increment to optimize for counters that are partitioned on time
Date Sun, 31 Oct 2010 21:06:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926779#action_12926779 ]

HBase Review Board commented on HBASE-3162:

Message from: "Jonathan Gray" <jgray@apache.org>

bq.  On 2010-10-31 13:32:44, khemani wrote:
bq.  > The timestamp that we put in the column qualifier to create hourly counters need not be in sync with the KV timestamp. This is because there are times when the log stream falls behind and we might be updating counters that are a couple of hours old. The time range that we provide has to be determined dynamically, based on the current log-stream delay.
bq.  >
bq.  > This will really work well if, along with hourly counters, we also have hourly store files. If everything gets compacted into a single store file, then this change doesn't help.

Yeah, if you aren't doing all of your increments at the same time as the stamps they represent,
you'll need to modify the TimeRange.

Something like:  [min,max) -> [minStampInPartition,Long.MAX_VALUE) where minStampInPartition
is the lowest timestamp possible for the time bucket you are incrementing.
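
A minimal sketch of that mapping, assuming hourly buckets and millisecond timestamps (hourly counters come from the quoted comment; the class and method names here are hypothetical and not part of the patch). It only computes the [minStampInPartition, Long.MAX_VALUE) pair; passing that pair to the TimeRange on Increment is left out because the setter added by the patch is not shown in this message.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: derives the TimeRange bounds for an hourly counter bucket.
final class HourlyBucketRange {
    static final long HOUR_MS = TimeUnit.HOURS.toMillis(1);

    /** Lowest timestamp (ms since epoch) that falls in the hourly bucket containing eventTimeMs. */
    static long minStampInPartition(long eventTimeMs) {
        // floorDiv rounds toward negative infinity, so pre-epoch values also land on a bucket start.
        return Math.floorDiv(eventTimeMs, HOUR_MS) * HOUR_MS;
    }

    /** Returns {min, max} for the half-open range [min, max). */
    static long[] timeRangeFor(long eventTimeMs) {
        // The upper bound stays open because a lagging log stream may write
        // to this bucket long after the hour it represents has ended.
        return new long[] { minStampInPartition(eventTimeMs), Long.MAX_VALUE };
    }

    public static void main(String[] args) {
        // Example: an event from a log stream that is running two hours behind.
        long delayedEvent = System.currentTimeMillis() - 2 * HOUR_MS;
        long[] range = timeRangeFor(delayedEvent);
        System.out.println("[" + range[0] + ", " + range[1] + ")");
    }
}
```

The lower bound depends only on the bucket being incremented, not on the current log-stream delay, which is why the delay does not need to be tracked separately in this sketch.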

As we accumulate a large amount of historical data, it will be important that our compaction
policy eventually just archives old data so that it is not included in further compactions.
This TimeRange functionality will ensure that the archived files don't impact performance on new data.

- Jonathan

> Add TimeRange support into Increment to optimize for counters that are partitioned on time
> ------------------------------------------------------------------------------------------
>                 Key: HBASE-3162
>                 URL: https://issues.apache.org/jira/browse/HBASE-3162
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, regionserver
>    Affects Versions: 0.90.0
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>            Priority: Minor
>             Fix For: 0.90.0
>         Attachments: HBASE-3162-v1.patch
> In many use cases of increments, a given counter is only incremented during a specific window of time (i.e., the counters are partitioned/sharded by time).
> With this kind of schema, you are constantly creating new counters. When a new counter is "created" (incremented for the first time), you will always end up looking at a block from every file in the region because no previous value exists. However, with the new TimeRange optimizations that skip files that contain no values in the TimeRange you're interested in, we could use that information to optimize the Get within the increment.
> This would be optional and an addition to the Increment class.
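
The file-skipping idea in the description can be illustrated with a small sketch. It assumes each store file records the minimum and maximum cell timestamp it holds; the `FileStamps` type and the method names are hypothetical and do not reflect the actual region server code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of skipping store files by timestamp range.
final class StoreFileSkipSketch {

    /** Assumed per-file metadata: oldest and newest cell timestamps, both inclusive. */
    static final class FileStamps {
        final String name;
        final long minStamp;
        final long maxStamp;

        FileStamps(String name, long minStamp, long maxStamp) {
            this.name = name;
            this.minStamp = minStamp;
            this.maxStamp = maxStamp;
        }
    }

    /** True if the file could hold a cell whose timestamp lies in [rangeMin, rangeMax). */
    static boolean mayContain(FileStamps f, long rangeMin, long rangeMax) {
        return f.maxStamp >= rangeMin && f.minStamp < rangeMax;
    }

    /** Names of the files a Get restricted to [rangeMin, rangeMax) would still have to read. */
    static List<String> filesToRead(List<FileStamps> files, long rangeMin, long rangeMax) {
        List<String> kept = new ArrayList<>();
        for (FileStamps f : files) {
            if (mayContain(f, rangeMin, rangeMax)) {
                kept.add(f.name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<FileStamps> files = new ArrayList<>();
        files.add(new FileStamps("hour-0", 0L, 3_599_999L));
        files.add(new FileStamps("hour-1", 3_600_000L, 7_199_999L));
        // Incrementing a counter in the second hourly bucket.
        System.out.println(filesToRead(files, 3_600_000L, Long.MAX_VALUE));
    }
}
```

If both hours were compacted into one file spanning 0 to 7,199,999, that file would overlap every range and nothing would be skipped, which matches the concern raised in the quoted comment about a single store file.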

