hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13109) Make better SEEK vs SKIP decisions during scanning
Date Wed, 11 Mar 2015 22:12:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357703#comment-14357703 ]

Andrew Purtell commented on HBASE-13109:

The commit of this to the 0.98 branch breaks Phoenix compilation when building with -Dhbase.version=0.98.12-SNAPSHOT
(after installing the latest 0.98 into the local Maven cache):

[ERROR] /Users/apurtell/src/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/IndexHalfStoreFileReader.java:[141,35]
<anonymous org.apache.hadoop.hbase.regionserver.IndexHalfStoreFileReader$1> is not abstract and does not override abstract method getNextIndexedKey() in org.apache.hadoop.hbase.io.hfile.HFileScanner

[ERROR] /Users/apurtell/src/phoenix/phoenix-core/src/main/java/org/apache/phoenix/hbase/index/scanner/FilteredKeyValueScanner.java:[37,8]
org.apache.phoenix.hbase.index.scanner.FilteredKeyValueScanner is not abstract and does not override abstract method getNextIndexedKey() in org.apache.hadoop.hbase.regionserver.KeyValueScanner

KeyValueScanner and HFileScanner are both marked as InterfaceAudience.Private. What should
we do here?
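
For illustration, the failure mode above can be reproduced in miniature: once an interface gains a new abstract method, every downstream implementer stops compiling until it supplies that method. The sketch below uses hypothetical stand-in types (SimpleScanner, ArrayScanner, DelegatingScanner), not the real HBase or Phoenix classes, and the byte[] return type is an assumption made only for this example. It shows one possible shape of a downstream fix, a wrapper that forwards the new call to the scanner it wraps.

```java
import java.util.Arrays;

public class DelegateDemo {

    // Hypothetical stand-in for a scanner interface; NOT the real KeyValueScanner.
    interface SimpleScanner {
        byte[] peek();

        // The newly added abstract method. The byte[] return type is an
        // assumption for this sketch only.
        byte[] getNextIndexedKey();
    }

    // Hypothetical concrete scanner that knows its own next indexed key.
    static class ArrayScanner implements SimpleScanner {
        private final byte[] current;
        private final byte[] nextIndexedKey;

        ArrayScanner(byte[] current, byte[] nextIndexedKey) {
            this.current = current;
            this.nextIndexedKey = nextIndexedKey;
        }

        @Override
        public byte[] peek() {
            return current;
        }

        @Override
        public byte[] getNextIndexedKey() {
            return nextIndexedKey;
        }
    }

    // Hypothetical downstream wrapper, similar in spirit to a filtering scanner.
    static class DelegatingScanner implements SimpleScanner {
        private final SimpleScanner delegate;

        DelegatingScanner(SimpleScanner delegate) {
            this.delegate = delegate;
        }

        @Override
        public byte[] peek() {
            return delegate.peek();
        }

        // Removing this override reproduces the class of error quoted above:
        // "is not abstract and does not override abstract method getNextIndexedKey()".
        @Override
        public byte[] getNextIndexedKey() {
            return delegate.getNextIndexedKey();
        }
    }

    public static void main(String[] args) {
        SimpleScanner base = new ArrayScanner(new byte[] {1}, new byte[] {9});
        SimpleScanner wrapped = new DelegatingScanner(base);
        System.out.println(Arrays.toString(wrapped.getNextIndexedKey()));
    }
}
```

Forwarding to the wrapped scanner is only one option. Whether the downstream code should be patched this way, or the upstream change made source-compatible instead, is the open question raised in the comment.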


> Make better SEEK vs SKIP decisions during scanning
> --------------------------------------------------
>                 Key: HBASE-13109
>                 URL: https://issues.apache.org/jira/browse/HBASE-13109
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.12
>         Attachments: 13109-0.98-v4.txt, 13109-trunk-v2.txt, 13109-trunk-v3.txt, 13109-trunk-v4.txt, 13109-trunk-v5.txt, 13109-trunk.txt, nextIndexKVChange_new.patch
>
> I'm re-purposing this issue to add a heuristic as to when to SEEK and when to SKIP Cells. This has come up in various issues, and I think I have a way to finally fix this now. HBASE-9778, HBASE-12311, and friends are related.
> --- Old description ---
> This is a continuation of HBASE-9778.
> We've seen a scenario of a very slow scan over a region using a timerange that happens to fall after the ts of any Cell in the region.
> Turns out we spend a lot of time seeking.
> Tested with a 5 column table, and the scan is 5x faster when the timerange falls before all Cells' ts.
> We can use the lookahead hint introduced in HBASE-9778 to do opportunistic SKIPing before we actually seek.
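
The idea of opportunistic SKIPing before a SEEK can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patch attached to the issue: keys are modeled as a sorted array of distinct long values, a "skip" is a single step forward, a "seek" is a binary search over the remaining keys, and the class, method, and parameter names (LookaheadSkip, advance, lookahead) are hypothetical.

```java
import java.util.Arrays;

public class LookaheadSkip {

    /**
     * Returns the index of the first key >= target, starting from pos.
     * Steps forward at most `lookahead` times before falling back to a seek.
     * Assumes sortedKeys is sorted ascending and contains no duplicates.
     */
    static int advance(long[] sortedKeys, int pos, long target, int lookahead) {
        // Opportunistic phase: cheap sequential skips, bounded by the lookahead hint.
        for (int i = 0; i < lookahead && pos < sortedKeys.length; i++) {
            if (sortedKeys[pos] >= target) {
                return pos; // target reached by skipping alone, no seek needed
            }
            pos++;
        }
        if (pos >= sortedKeys.length || sortedKeys[pos] >= target) {
            return pos;
        }
        // Fallback phase: the target is further away, so pay for one seek.
        int idx = Arrays.binarySearch(sortedKeys, pos, sortedKeys.length, target);
        return idx >= 0 ? idx : -idx - 1; // decode the insertion point when absent
    }

    public static void main(String[] args) {
        long[] keys = {1, 3, 5, 7, 9, 11, 13};
        System.out.println(advance(keys, 0, 5, 3));  // nearby target, found by skipping
        System.out.println(advance(keys, 0, 13, 2)); // distant target, needs the seek
    }
}
```

With a small lookahead, nearby targets are reached without a seek while distant ones still cost at most one. Choosing the bound is the heuristic part, and this sketch does not attempt to model how the issue's patch makes that choice.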

This message was sent by Atlassian JIRA
