hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16213) A new HFileBlock structure for fast random get
Date Tue, 19 Jul 2016 05:59:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383619#comment-15383619 ]

stack commented on HBASE-16213:

Nice. Seeking within the row during random reads is one of the main consumers of CPU.

Why bother having two encoders? Why not just one that does row and column family index?

Any idea how much more work (CPU?) we are doing when this is enabled? Is it less with this
feature on, or more? Under what circumstances, do you think?

Let me try this.  Meantime here are some comments on the patch:

In the class comment, in either the encoder or the decoder, describe how the encoding works and
what the layout looks like, with some advice on when to use it. That can then be copy-pasted as
the release note on this issue.



Could the above return a length so you don't have to reget it on the next line with:

    int size = KeyValueUtil.length(cell);

The length parse costs.

Is there anywhere you can get a count of how many kvs are in the block, so you can use it here to presize the list:

      List<ByteBuffer> kvs = new ArrayList<ByteBuffer>();
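
As a minimal sketch of the presizing suggested here, assuming the number of cells in the block is known before the list is filled: `PresizedListSketch`, `collect` and `cellCount` are hypothetical names for illustration and are not taken from the patch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not code from the HBASE-16213 patch.
final class PresizedListSketch {
  /** Builds a list of cellCount buffers without the backing array ever growing. */
  static List<ByteBuffer> collect(int cellCount) {
    // Passing the expected size up front avoids repeated grow-and-copy steps.
    List<ByteBuffer> kvs = new ArrayList<ByteBuffer>(cellCount);
    for (int i = 0; i < cellCount; i++) {
      // Placeholder standing in for one serialized cell.
      kvs.add(ByteBuffer.allocate(0));
    }
    return kvs;
  }

  public static void main(String[] args) {
    System.out.println(collect(3).size());
  }
}
```

If no exact count is available, a reasonable estimate still saves most of the resizing.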

Remove these...

    // TODO Auto-generated method stub

Put these together?

      LOG.trace("RowNumber: " + rowsOffset.size());
      LOG.trace("onDiskSize: " + onDiskSize);

One line is easier to read than two...

Got half way through... will be back w/ more. Nice.

> A new HFileBlock structure for fast random get
> ----------------------------------------------
>                 Key: HBASE-16213
>                 URL: https://issues.apache.org/jira/browse/HBASE-16213
>             Project: HBase
>          Issue Type: New Feature
>          Components: Performance
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-16213-master_v1.patch, HBASE-16213.patch, HBASE-16213_v2.patch
> HFileBlock stores cells sequentially. Currently, to get a row from the block, it scans
> from the first cell until it reaches the row's cell.
> The new structure stores every row's start offset with the data, so it can find the exact
> row with a binary search.
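
The idea in the issue description can be sketched roughly as follows. This is a toy layout, not the encoding from the patch: each row is assumed to be stored as a 2-byte length followed by the row bytes, and `RowIndexSketch`, `encode`, `rowAt` and `seek` are hypothetical names. The point is only that a sorted array of row start offsets lets a lookup binary search the block instead of scanning from the first cell.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the HBASE-16213 patch or any HBase API.
final class RowIndexSketch {
  /** Writes sorted rows as [2-byte length][row bytes] and records each row's start offset. */
  static int[] encode(String[] sortedRows, ByteBuffer block) {
    int[] rowOffsets = new int[sortedRows.length];
    for (int i = 0; i < sortedRows.length; i++) {
      byte[] row = sortedRows[i].getBytes(StandardCharsets.UTF_8);
      rowOffsets[i] = block.position();
      block.putShort((short) row.length).put(row);
    }
    return rowOffsets;
  }

  /** Reads the row key stored at the given offset using absolute gets. */
  static byte[] rowAt(ByteBuffer block, int offset) {
    int len = block.getShort(offset);
    byte[] row = new byte[len];
    for (int i = 0; i < len; i++) {
      row[i] = block.get(offset + 2 + i);
    }
    return row;
  }

  /** Lexicographic comparison treating bytes as unsigned. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) return d;
    }
    return a.length - b.length;
  }

  /** Binary searches the offset index; returns the row's start offset, or -1 if absent. */
  static int seek(ByteBuffer block, int[] rowOffsets, String row) {
    byte[] key = row.getBytes(StandardCharsets.UTF_8);
    int lo = 0;
    int hi = rowOffsets.length - 1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      int cmp = compare(rowAt(block, rowOffsets[mid]), key);
      if (cmp < 0) {
        lo = mid + 1;
      } else if (cmp > 0) {
        hi = mid - 1;
      } else {
        return rowOffsets[mid];
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    ByteBuffer block = ByteBuffer.allocate(64);
    int[] offsets = encode(new String[] {"row-a", "row-b", "row-d"}, block);
    System.out.println(seek(block, offsets, "row-b")); // start offset of "row-b"
    System.out.println(seek(block, offsets, "row-c")); // -1, row not in block
  }
}
```

With n rows in a block, a lookup costs O(log n) row comparisons rather than a scan over every preceding cell, at the price of storing one offset per row.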

This message was sent by Atlassian JIRA
