hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-11811) Use binary search for seeking into a block
Date Sat, 23 Aug 2014 07:05:11 GMT

     [ https://issues.apache.org/jira/browse/HBASE-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-11811:

    Attachment: block_index-v2.txt

Here's a sample patch that reduces the time for 1m Gets from 40s to 8.5s, and there is probably
more room for optimization.
The data was simply generated by HBaseTestingUtility.loadTable(...) so the KVs are small.

Some points:
# the utility of this decreases as Cells get larger and only a few of them fit into a block
# the index is not persisted, so when a block is evicted and later reloaded, the index needs
to be built again
# I'm currently not happy about the point where the index is built, as that needs to synchronize
on the block (but only when the block actually had to be loaded)
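
To illustrate the last two points, here is a minimal sketch of an in-memory index that is built at most once per loaded block and never persisted. It is not taken from the attached patch: the class name LazyIndexedBlockSketch, the indexBuilder parameter, and the double-checked locking approach are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

// Hypothetical wrapper, not an HBase class: pairs a loaded block with an
// in-memory offset index that is built lazily and never written to disk.
final class LazyIndexedBlockSketch {
    private final ByteBuffer data;
    // Assumed callback that scans the block and returns the cell offsets.
    private final Function<ByteBuffer, int[]> indexBuilder;
    // null until the first seek into this block needs the index.
    private volatile int[] offsets;

    LazyIndexedBlockSketch(ByteBuffer data, Function<ByteBuffer, int[]> indexBuilder) {
        this.data = data;
        this.indexBuilder = indexBuilder;
    }

    /** Returns the offset index, building it on first use. */
    int[] offsets() {
        int[] local = offsets;
        if (local == null) {
            // Synchronization is only paid while the index is still missing,
            // i.e. right after the block has been (re)loaded.
            synchronized (this) {
                local = offsets;
                if (local == null) {
                    local = indexBuilder.apply(data);
                    offsets = local;
                }
            }
        }
        return local;
    }
}
```

Because the index lives only in the wrapper object, a block that is evicted and reloaded gets a fresh wrapper and the index is built again on the next seek.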

> Use binary search for seeking into a block
> ------------------------------------------
>                 Key: HBASE-11811
>                 URL: https://issues.apache.org/jira/browse/HBASE-11811
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
>         Attachments: block_index-v2.txt
> Currently upon every seek (including Gets) we need to linearly look through the block
> from the beginning until we find the Cell we are looking for.
> It should be possible to build a simple cache of offsets of Cells for each block as it
> is loaded and then use binary search to find the Cell in question.
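
The idea in the description can be sketched roughly as follows. This is an illustration only, not the attached patch: the block layout ([int keyLen][key][int valueLen][value], sorted by key) is a simplified stand-in rather than HBase's actual KeyValue encoding, and the names BlockOffsetIndexSketch, buildOffsets, seek, compareKeyAt and buildBlock are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified stand-in for a data block; NOT HBase's actual KeyValue layout.
// Each cell is stored as [int keyLen][key bytes][int valueLen][value bytes],
// and cells are assumed to be sorted by key.
final class BlockOffsetIndexSketch {

    /** One linear pass over the block, recording where each cell starts. */
    static int[] buildOffsets(ByteBuffer block) {
        int[] offsets = new int[16];
        int count = 0;
        int pos = 0;
        while (pos < block.limit()) {
            if (count == offsets.length) {
                offsets = Arrays.copyOf(offsets, count * 2);
            }
            offsets[count++] = pos;
            int keyLen = block.getInt(pos);
            int valueLen = block.getInt(pos + 4 + keyLen);
            pos += 4 + keyLen + 4 + valueLen;
        }
        return Arrays.copyOf(offsets, count);
    }

    /** Returns the offset of the first cell with key >= searchKey, or -1 if none. */
    static int seek(ByteBuffer block, int[] offsets, byte[] searchKey) {
        int lo = 0;
        int hi = offsets.length; // search window is [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compareKeyAt(block, offsets[mid], searchKey) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo < offsets.length ? offsets[lo] : -1;
    }

    /** Unsigned lexicographic comparison of the key stored at offset with searchKey. */
    static int compareKeyAt(ByteBuffer block, int offset, byte[] searchKey) {
        int keyLen = block.getInt(offset);
        int n = Math.min(keyLen, searchKey.length);
        for (int i = 0; i < n; i++) {
            int a = block.get(offset + 4 + i) & 0xff;
            int b = searchKey[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return keyLen - searchKey.length;
    }

    /** Helper for experiments: serializes already-sorted {key, value} pairs into a block. */
    static ByteBuffer buildBlock(String[][] sortedCells) {
        int size = 0;
        for (String[] cell : sortedCells) {
            size += 8 + cell[0].getBytes(StandardCharsets.UTF_8).length
                      + cell[1].getBytes(StandardCharsets.UTF_8).length;
        }
        ByteBuffer block = ByteBuffer.allocate(size);
        for (String[] cell : sortedCells) {
            byte[] key = cell[0].getBytes(StandardCharsets.UTF_8);
            byte[] value = cell[1].getBytes(StandardCharsets.UTF_8);
            block.putInt(key.length).put(key).putInt(value.length).put(value);
        }
        block.flip();
        return block;
    }
}
```

With n Cells in a block, building the offsets costs one linear pass, after which each seek needs on the order of log2(n) key comparisons instead of a scan from the start of the block. That gain shrinks when Cells are large and n is small.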
