hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9440) Pass blocks of KVs from HFile scanner to the StoreFileScanner and up
Date Sat, 14 Sep 2013 04:26:51 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767365#comment-13767365 ]

Lars Hofhansl commented on HBASE-9440:
--------------------------------------

bq. You mean going via front door?
Sorry, yes.

bq. What should take-away be? What we need to dig in on? To go faster, we need to do the prefixtreeblocks
and pull blocks up out of hfile?

Not entirely sure... The data suggest that with 50 cols the best we can do is a ~5x improvement
(and that is if we can pass the KVs up with *no* overhead).

For tall tables, we might want to check what the per-row overhead is (is it the creation of
the Result object, for example?).

Yes, to go faster we need to be able to scan encoded KVs and pass them up unchanged to the
various heaps, to avoid all that baggage of the key for every column (0.94 and trunk still
do that for the prefix encoders). We need to be able to pass KVs around that are not backed
by a contiguous byte[].
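
To illustrate what a KV not backed by a single contiguous byte[] could look like, here is a minimal sketch. It assumes a hypothetical SegmentedCellSketch class (not an HBase class) in which the row and the qualifier are (buffer, offset, length) slices, so many columns of one row can share a single copy of the row key. The comparison works directly on the slices without materializing a full key per column.

```java
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical cell view, NOT an HBase class: row and qualifier are slices of
 * separate (possibly shared) buffers instead of one contiguous byte[].
 */
final class SegmentedCellSketch {
    final byte[] rowBuf;  final int rowOff;  final int rowLen;
    final byte[] qualBuf; final int qualOff; final int qualLen;

    SegmentedCellSketch(byte[] rowBuf, int rowOff, int rowLen,
                        byte[] qualBuf, int qualOff, int qualLen) {
        this.rowBuf = rowBuf;   this.rowOff = rowOff;   this.rowLen = rowLen;
        this.qualBuf = qualBuf; this.qualOff = qualOff; this.qualLen = qualLen;
    }

    /** Unsigned lexicographic comparison of two byte ranges. */
    static int compareRange(byte[] a, int aOff, int aLen,
                            byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int d = (a[aOff + i] & 0xff) - (b[bOff + i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return aLen - bLen; // shorter range sorts first when it is a prefix
    }

    /** Orders by row, then by qualifier, without copying either into a new array. */
    static int compare(SegmentedCellSketch x, SegmentedCellSketch y) {
        int c = compareRange(x.rowBuf, x.rowOff, x.rowLen,
                             y.rowBuf, y.rowOff, y.rowLen);
        if (c != 0) {
            return c;
        }
        return compareRange(x.qualBuf, x.qualOff, x.qualLen,
                            y.qualBuf, y.qualOff, y.qualLen);
    }

    public static void main(String[] args) {
        // One copy of the row key is shared by both columns.
        byte[] sharedRow = "row1".getBytes(StandardCharsets.UTF_8);
        byte[] quals = "c1c2".getBytes(StandardCharsets.UTF_8);
        SegmentedCellSketch a = new SegmentedCellSketch(sharedRow, 0, 4, quals, 0, 2);
        SegmentedCellSketch b = new SegmentedCellSketch(sharedRow, 0, 4, quals, 2, 2);
        System.out.println(compare(a, b) < 0); // prints true
    }
}
```

The sketch leaves out family, timestamp, type and value. The point is only that the heaps could compare such views in place rather than requiring every KV to carry its own full key.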

                
> Pass blocks of KVs from HFile scanner to the StoreFileScanner and up
> --------------------------------------------------------------------
>
>                 Key: HBASE-9440
>                 URL: https://issues.apache.org/jira/browse/HBASE-9440
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>
> Currently we read KVs from an HFileScanner one-by-one and pass them up the scanner/heap
> tree. Often the ranges of KVs retrieved from StoreFileScanner (by StoreScanners) and HFileScanner
> (by StoreFileScanner) will be non-overlapping. If chunks of KVs do not overlap we can sort
> entire chunks just by comparing the start/end key of the chunk. Only if chunks are overlapping
> do we need to sort KV by KV as we do now.
> I have no patch, but I wanted to float this idea. 
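
The chunk idea in the description can be sketched roughly as follows. This is an illustration only, not a patch: ChunkMergeSketch, Chunk and merge are hypothetical names, and plain String keys stand in for KVs. When the key ranges of two sorted chunks do not overlap, one comparison of the boundary keys orders the whole chunks; only overlapping chunks fall back to a key-by-key merge.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of chunk-wise merging; String keys stand in for KVs. */
final class ChunkMergeSketch {

    /** A non-empty run of keys that is already sorted (e.g. one block's worth). */
    public static final class Chunk {
        final List<String> keys;

        public Chunk(String... keys) {
            this.keys = Arrays.asList(keys);
        }

        String first() { return keys.get(0); }
        String last()  { return keys.get(keys.size() - 1); }
    }

    /** Merges two sorted chunks, comparing only boundary keys when ranges do not overlap. */
    public static List<String> merge(Chunk a, Chunk b) {
        List<String> out = new ArrayList<>(a.keys.size() + b.keys.size());

        // Fast path: a lies entirely before b, so one comparison orders both chunks.
        if (a.last().compareTo(b.first()) <= 0) {
            out.addAll(a.keys);
            out.addAll(b.keys);
            return out;
        }
        // Fast path: b lies entirely before a.
        if (b.last().compareTo(a.first()) < 0) {
            out.addAll(b.keys);
            out.addAll(a.keys);
            return out;
        }

        // Slow path: ranges overlap, so merge key by key (ties take from a first).
        int i = 0;
        int j = 0;
        while (i < a.keys.size() && j < b.keys.size()) {
            if (a.keys.get(i).compareTo(b.keys.get(j)) <= 0) {
                out.add(a.keys.get(i++));
            } else {
                out.add(b.keys.get(j++));
            }
        }
        while (i < a.keys.size()) {
            out.add(a.keys.get(i++));
        }
        while (j < b.keys.size()) {
            out.add(b.keys.get(j++));
        }
        return out;
    }

    public static void main(String[] args) {
        // Non-overlapping ranges: handled by a single boundary comparison.
        System.out.println(merge(new Chunk("a", "b"), new Chunk("c", "d")));
        // Overlapping ranges: falls back to the key-by-key merge.
        System.out.println(merge(new Chunk("a", "c"), new Chunk("b", "d")));
    }
}
```

In a real scanner tree the same test would presumably apply across more than two sources, with a heap of chunks ordered by start key, but that is beyond this sketch.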

