hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13109) Make better SEEK vs SKIP decisions during scanning
Date Tue, 03 Mar 2015 07:10:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344647#comment-14344647 ]

ramkrishna.s.vasudevan commented on HBASE-13109:

bq. But KeyValue.KVComparator.compareOnlyKeyPortion(Cell, Cell) will not work, because I cannot
make a Cell from the seek Cell in SQM without materializing the byte[]... That's the part
I have to avoid.

I understand why you have to avoid this: you will not use the Cell from SQM directly, because
the ts and type are the ones that you will be passing.

In our internal branch we tried an approach for cases where we want FirstOnRow, LastOnCol, or
FirstOnCol type KVs: we created a new FirstOnCol cell object that is passed the cell, but whose
getTS and getType return LATEST_TIMESTAMP/MIN_TIMESTAMP and type MAX/MIN based on what we
want, such that it is two cells.

Anyway, I think changing nextIndexKey to a Cell does not matter, except that there is a new
compare() API. I am fine with carrying on as it is now in your patches. Thanks, Lars.
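
A minimal sketch of the wrapper idea described above, not the code from the internal branch or
from any patch on this issue. It assumes a simplified, hypothetical SketchCell interface instead
of HBase's real Cell, and a simplified comparator (row, family, qualifier ascending, then
timestamp and type descending). The names FirstOnColSketch, SketchCell, SimpleCell and FakeKeyCell
are invented for illustration. The constants are assumed to mirror LATEST_TIMESTAMP
(Long.MAX_VALUE), the minimum timestamp (Long.MIN_VALUE) and type codes Maximum (255) and
Minimum (0).

```java
final class FirstOnColSketch {
    // Hypothetical, simplified stand-in for HBase's Cell (not the real interface).
    interface SketchCell {
        byte[] row();
        byte[] family();
        byte[] qualifier();
        long timestamp();
        int type(); // unsigned type code, e.g. 4 is assumed to mean Put
    }

    // Plain cell that owns its key parts.
    static final class SimpleCell implements SketchCell {
        private final byte[] row, family, qualifier;
        private final long ts;
        private final int type;
        SimpleCell(byte[] row, byte[] family, byte[] qualifier, long ts, int type) {
            this.row = row; this.family = family; this.qualifier = qualifier;
            this.ts = ts; this.type = type;
        }
        public byte[] row() { return row; }
        public byte[] family() { return family; }
        public byte[] qualifier() { return qualifier; }
        public long timestamp() { return ts; }
        public int type() { return type; }
    }

    // Wrapper: row/family/qualifier come from the delegate without copying any byte[];
    // only timestamp and type are overridden.
    static final class FakeKeyCell implements SketchCell {
        private final SketchCell delegate;
        private final long ts;
        private final int type;
        FakeKeyCell(SketchCell delegate, long ts, int type) {
            this.delegate = delegate; this.ts = ts; this.type = type;
        }
        public byte[] row() { return delegate.row(); }
        public byte[] family() { return delegate.family(); }
        public byte[] qualifier() { return delegate.qualifier(); }
        public long timestamp() { return ts; }
        public int type() { return type; }
    }

    // Assumed constants: latest timestamp and maximum type sort first within a column.
    static SketchCell firstOnCol(SketchCell c) { return new FakeKeyCell(c, Long.MAX_VALUE, 255); }
    // Assumed constants: minimum timestamp and minimum type sort last within a column.
    static SketchCell lastOnCol(SketchCell c) { return new FakeKeyCell(c, Long.MIN_VALUE, 0); }

    // Lexicographic comparison of byte arrays, treating bytes as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Simplified key ordering: key parts ascending, then newer timestamps and higher types first.
    static int compare(SketchCell a, SketchCell b) {
        int c = compareBytes(a.row(), b.row());
        if (c != 0) return c;
        c = compareBytes(a.family(), b.family());
        if (c != 0) return c;
        c = compareBytes(a.qualifier(), b.qualifier());
        if (c != 0) return c;
        c = Long.compare(b.timestamp(), a.timestamp());
        if (c != 0) return c;
        return Integer.compare(b.type(), a.type());
    }
}
```

Under these assumptions, firstOnCol(cell) sorts before every real cell of the same column and
lastOnCol(cell) sorts after it, without materializing a new key byte[].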

> Make better SEEK vs SKIP decisions during scanning
> --------------------------------------------------
>                 Key: HBASE-13109
>                 URL: https://issues.apache.org/jira/browse/HBASE-13109
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>         Attachments: 13109-0.98-v4.txt, 13109-trunk-v2.txt, 13109-trunk-v3.txt, 13109-trunk-v4.txt,
>                      13109-trunk.txt, nextIndexKVChange_new.patch
>
> I'm re-purposing this issue to add a heuristic as to when to SEEK and when to SKIP Cells.
> This has come up in various issues, and I think I have a way to finally fix this now. HBASE-9778,
> HBASE-12311, and friends are related.
> --- Old description ---
> This is a continuation of HBASE-9778.
> We've seen a scenario of a very slow scan over a region using a timerange that happens
> to fall after the ts of any Cell in the region.
> Turns out we spend a lot of time seeking.
> Tested with a 5 column table, and the scan is 5x faster when the timerange falls before
> all Cells' ts.
> We can use the lookahead hint introduced in HBASE-9778 to do opportunistic SKIPing before
> we actually seek.
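
The opportunistic skipping mentioned in the old description can be illustrated with a small
sketch. This is not HBase code and not the heuristic from the attached patches: it assumes a
sorted array of distinct long keys standing in for Cells, a hypothetical lookahead parameter as
the skip budget, and a binary search standing in for the more expensive seek. The names
SkipThenSeekSketch, Result and advance are invented for illustration.

```java
import java.util.Arrays;

final class SkipThenSeekSketch {
    // Where the scan ended up, and whether the fallback seek was needed.
    static final class Result {
        final int index;
        final boolean seeked;
        Result(int index, boolean seeked) { this.index = index; this.seeked = seeked; }
    }

    // Advance from position 'from' to the first key >= target.
    // Assumes sortedKeys is sorted ascending with distinct values.
    static Result advance(long[] sortedKeys, int from, long target, int lookahead) {
        int i = from;
        int skipped = 0;
        // Cheap path: step over at most 'lookahead' keys one at a time.
        while (i < sortedKeys.length && sortedKeys[i] < target && skipped < lookahead) {
            i++;
            skipped++;
        }
        if (i >= sortedKeys.length || sortedKeys[i] >= target) {
            return new Result(i, false); // reached the target (or the end) by skipping alone
        }
        // Expensive path: skip budget exhausted, fall back to a seek over the remainder.
        int pos = Arrays.binarySearch(sortedKeys, i, sortedKeys.length, target);
        int idx = pos >= 0 ? pos : -pos - 1;
        return new Result(idx, true);
    }
}
```

When the target is only a few keys ahead, the loop finds it without paying for a seek; the seek
is used only once the skip budget has been spent.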

This message was sent by Atlassian JIRA