hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-3745) Add the ability to restrict major-compactible files by timestamp
Date Fri, 14 Sep 2012 00:26:07 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-3745.

    Resolution: Duplicate

Let me mark this as a DUP of HBASE-6371.

> Add the ability to restrict major-compactible files by timestamp
> ----------------------------------------------------------------
>                 Key: HBASE-3745
>                 URL: https://issues.apache.org/jira/browse/HBASE-3745
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
> In some applications, a common access pattern is to frequently scan tables with a time
> range predicate restricted to a fairly recent time window. For example, you may want to do
> an incremental aggregation or indexing step only on rows that have changed in the last hour.
> We do this efficiently by tracking the min and max timestamp at the HFile level, so that old
> HFiles don't have to be read.
> After a major compaction, however, the entire dataset will need to be read, which can
> hurt performance of this access pattern.
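
As a rough illustration of the time-range check described above, the following sketch keeps only the files whose timestamp range overlaps the scan range. The `TimeRangePruning` and `FileInfo` names, the field names, and the half-open scan range are assumptions made for this example; they are not HBase classes or APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; these types are hypothetical, not HBase classes.
class TimeRangePruning {

    /** Stand-in for the min/max timestamp metadata tracked per HFile. */
    static final class FileInfo {
        final String name;
        final long minTs;
        final long maxTs;

        FileInfo(String name, long minTs, long maxTs) {
            this.name = name;
            this.minTs = minTs;
            this.maxTs = maxTs;
        }
    }

    /**
     * Returns the files a scan over [scanMin, scanMax) has to open:
     * those whose [minTs, maxTs] range overlaps the scan range.
     */
    static List<FileInfo> filesToRead(List<FileInfo> files, long scanMin, long scanMax) {
        List<FileInfo> result = new ArrayList<>();
        for (FileInfo f : files) {
            boolean overlaps = f.maxTs >= scanMin && f.minTs < scanMax;
            if (overlaps) {
                result.add(f);
            }
        }
        return result;
    }
}
```

With separate old and recent files, a scan over a recent window skips the old file. Once everything is merged into one file whose range covers all timestamps, that file overlaps every scan range and is always read.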
> We should add a column family attribute that can specify a policy like: When major compacting,
> never include an HFile that contains data with a timestamp in the last 4 hours. This way, recently
> flushed HFiles will always remain uncompacted and provide the good scan performance required for
> these applications.
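
The proposed policy could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HBase compaction code and not necessarily what HBASE-6371 implements: `RecentFileExclusion`, `StoreFileInfo`, `selectForMajorCompaction`, and the `protectWindowMs` parameter are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; these types are hypothetical, not HBase classes.
class RecentFileExclusion {

    /** Stand-in for a store file and the newest timestamp it contains. */
    static final class StoreFileInfo {
        final String name;
        final long maxTimestampMs;

        StoreFileInfo(String name, long maxTimestampMs) {
            this.name = name;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    /**
     * Selects the files eligible for major compaction. A file is skipped if
     * its newest cell falls inside the protected window, that is, at or
     * after (nowMs - protectWindowMs).
     */
    static List<StoreFileInfo> selectForMajorCompaction(
            List<StoreFileInfo> candidates, long nowMs, long protectWindowMs) {
        long cutoffMs = nowMs - protectWindowMs;
        List<StoreFileInfo> selected = new ArrayList<>();
        for (StoreFileInfo f : candidates) {
            if (f.maxTimestampMs < cutoffMs) {
                selected.add(f);
            }
        }
        return selected;
    }
}
```

For the 4-hour example in the description, a caller would pass `TimeUnit.HOURS.toMillis(4)` as `protectWindowMs`. Comparing only the maximum timestamp is enough here, because a file contains data from the protected window exactly when its newest cell does.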

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
