hbase-issues mailing list archives

From "Kannan Muthukkaruppan (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-4823) long running scans lose benefit of bloomfilters and timerange hints
Date Sat, 19 Nov 2011 02:26:51 GMT

     [ https://issues.apache.org/jira/browse/HBASE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kannan Muthukkaruppan updated HBASE-4823:

    Summary: long running scans lose benefit of bloomfilters and timerange hints  (was: long running scan lose benefit of bloomfilters and timerange hints)
> long running scans lose benefit of bloomfilters and timerange hints
> -------------------------------------------------------------------
>                 Key: HBASE-4823
>                 URL: https://issues.apache.org/jira/browse/HBASE-4823
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
> When you have a long-running scan, say from an MR job, you can lose the benefit of timerange hints and bloom filters midway if your scanner gets reset. [Note: scanners can get reset due to, for example, a flush or compaction.]
> In one of our workloads, we periodically want to do rollups on the most recent 15 minutes of data in a column family, but the timerange hint benefit is lost midway when this resetScannerStack (shown below) happens. The end result: we end up reading all the old HFiles rather than just the recent HFiles.
> {code}
>  private void resetScannerStack(KeyValue lastTopKey) throws IOException {
>     if (heap != null) {
>       throw new RuntimeException("StoreScanner.reseek run on an existing heap!");
>     }
>     /* When we have the scan object, should we not pass it to getScanners()
>      * to get a limited set of scanners? We did so in the constructor and we
>      * could have done it now by storing the scan object from the constructor */
>     List<KeyValueScanner> scanners = getScanners();
> {code}
> The comment in the code seems to be aware of this issue and even has the suggested fix!
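
To make the effect concrete, here is a minimal toy sketch of the idea in that code comment: keep the timerange hint from construction time and reuse it when the scanner stack is rebuilt. This is not HBase code. StoreFile, TimeRange, ToyStoreScanner and both reset methods are hypothetical stand-ins, and the file selection is assumed to be a simple timestamp-overlap check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every type and method here is a hypothetical stand-in,
// not an HBase class or API.
class ScannerResetSketch {

    // Stand-in for an HFile, tagged with the timestamp span of its cells.
    static final class StoreFile {
        final String name;
        final long minTs;
        final long maxTs;

        StoreFile(String name, long minTs, long maxTs) {
            this.name = name;
            this.minTs = minTs;
            this.maxTs = maxTs;
        }
    }

    // Stand-in for the scan's timerange hint, covering [min, max).
    static final class TimeRange {
        final long min;
        final long max;

        TimeRange(long min, long max) {
            this.min = min;
            this.max = max;
        }

        // True if the file could hold cells inside the requested range.
        boolean mayContain(StoreFile f) {
            return f.maxTs >= min && f.minTs < max;
        }
    }

    static final class ToyStoreScanner {
        private final List<StoreFile> files;
        private final TimeRange range; // stored at construction so a reset can reuse it

        ToyStoreScanner(List<StoreFile> files, TimeRange range) {
            this.files = files;
            this.range = range;
        }

        // Reset that ignores the hint: every file is selected again.
        List<String> resetWithoutHint() {
            return select(null);
        }

        // Reset that reuses the stored hint: only overlapping files are selected.
        List<String> resetWithHint() {
            return select(range);
        }

        private List<String> select(TimeRange r) {
            List<String> selected = new ArrayList<>();
            for (StoreFile f : files) {
                if (r == null || r.mayContain(f)) {
                    selected.add(f.name);
                }
            }
            return selected;
        }
    }

    public static void main(String[] args) {
        List<StoreFile> files = Arrays.asList(
                new StoreFile("old-1", 0L, 1_000L),
                new StoreFile("old-2", 1_000L, 2_000L),
                new StoreFile("recent", 9_000L, 10_000L));
        ToyStoreScanner scanner = new ToyStoreScanner(files, new TimeRange(9_100L, 10_001L));

        System.out.println(scanner.resetWithoutHint()); // [old-1, old-2, recent]
        System.out.println(scanner.resetWithHint());    // [recent]
    }
}
```

In this toy model the only difference between the two resets is whether the stored range is consulted, which mirrors the gap between the constructor path and the reset path described above.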


