hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7755) Experiment with LAB in BlockEndcoding
Date Tue, 05 Feb 2013 04:26:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571029#comment-13571029 ]

Lars Hofhansl commented on HBASE-7755:

After thinking about this a bit more... It does not make sense to make the chunk size larger
than an individual block's size, and thus it makes sense to create a new LAB when a block is
scanned, and to do that for each block. The LAB is almost free to allocate; and since this is
done only when we seek+read (i.e. scanning) and only when block encoding is enabled, we'll be
copying a lot of bytes during the decoding.

From that viewpoint BufferedEncodedSeeker is in fact the right place. The only part missing
is the config option to disable this (which turns out to be a bit more tricky to do nicely
- without passing information down a 10-deep call stack).

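To make the per-block idea concrete, here is a minimal sketch of a block-sized allocation
buffer that hands out slices of one backing array instead of calling ByteBuffer.allocate(...)
for every KV. It is not the attached patch and not the MemStoreLAB API: BlockLabSketch,
BlockLab, and copyKv are hypothetical names, and the 64 KB chunk size is only an assumed
block size for illustration.

```java
import java.nio.ByteBuffer;

public class BlockLabSketch {

    /** Hypothetical per-block allocation buffer: one backing array, bump-pointer slices. */
    public static final class BlockLab {
        private final byte[] chunk;
        private int next = 0;

        public BlockLab(int chunkSize) {
            this.chunk = new byte[chunkSize];
        }

        /** Returns a slice of the chunk, or null if the request does not fit. */
        public ByteBuffer allocate(int size) {
            if (size < 0 || size > chunk.length - next) {
                return null; // caller falls back to a plain heap allocation
            }
            ByteBuffer slice = ByteBuffer.wrap(chunk, next, size).slice();
            next += size;
            return slice;
        }

        /** Number of bytes handed out so far. */
        public int used() {
            return next;
        }
    }

    /** Copies one decoded KV into the LAB, falling back to ByteBuffer.allocate if it is full. */
    public static ByteBuffer copyKv(BlockLab lab, byte[] kv) {
        ByteBuffer buf = lab.allocate(kv.length);
        if (buf == null) {
            buf = ByteBuffer.allocate(kv.length);
        }
        buf.put(kv);
        buf.flip(); // make the copied bytes readable from position 0
        return buf;
    }

    public static void main(String[] args) {
        int assumedBlockSize = 64 * 1024; // assumption: chunk sized to one block
        BlockLab lab = new BlockLab(assumedBlockSize); // one new LAB per scanned block
        ByteBuffer first = copyKv(lab, new byte[] {1, 2, 3});
        ByteBuffer second = copyKv(lab, new byte[] {4, 5, 6, 7, 8});
        System.out.println("bytes used in LAB: " + lab.used());
        System.out.println("first=" + first.remaining() + " second=" + second.remaining());
    }
}
```

Because the chunk is never larger than the block being decoded, a new BlockLab per block
costs a single array allocation, and the whole chunk becomes garbage once the scanner moves
on to the next block.
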
> Experiment with LAB in BlockEndcoding
> -------------------------------------
>                 Key: HBASE-7755
>                 URL: https://issues.apache.org/jira/browse/HBASE-7755
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>             Fix For: 0.94.6
>         Attachments: 7755-0.94-W_I_P_v1.txt, 7755-0.94-WORK_IN_PROGRESS.txt
> I was looking at and profiling the BlockEncoding code to figure out how to make it faster.
> One issue that jumped out was we call ByteBuffer.allocate(...) for each single KV.
> As an experiment I tried using the MemStoreLAB code to allocate those buffers.
> Here are some preliminary numbers, all scanning 10m rows (all in cache):
> * no encoding: 5.2s
> * FAST_DIFF without patch: 7.3s
> * FAST_DIFF with patch and small LAB: 4.1s
> * FAST_DIFF with patch and large LAB: 11s
> So this is very sensitive to the right sizing of the LAB.
> Need to do a bit more testing, but it seems that there is a chance to actually make scanning
> with block encoding faster than without!
