hbase-issues mailing list archives

From "Phil Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16213) A new HFileBlock structure for fast random get
Date Thu, 29 Dec 2016 09:42:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15784969#comment-15784969 ]

Phil Yang commented on HBASE-16213:

Hi [~aoxiang] 
Thanks for your important work.

Does V2 have any weaknesses compared with V1? Judging by their formats, V2 seems to have
only advantages. :) Neither 1.4 nor 2.0 has been released yet, so can we just improve this
structure rather than create several kinds of structures?

And with the idea behind V2, we could also store the qualifier only once for all versions of a cell, right?

> A new HFileBlock structure for fast random get
> ----------------------------------------------
>                 Key: HBASE-16213
>                 URL: https://issues.apache.org/jira/browse/HBASE-16213
>             Project: HBase
>          Issue Type: New Feature
>          Components: Performance
>            Reporter: binlijin
>            Assignee: binlijin
>             Fix For: 2.0.0, 1.4.0
>         Attachments: HBASE-16213-master_v1.patch, HBASE-16213-master_v3.patch, HBASE-16213-master_v4.patch,
>                      HBASE-16213-master_v5.patch, HBASE-16213-master_v6.patch, HBASE-16213.branch-1.v1.patch, HBASE-16213.branch-1.v4.patch,
>                      HBASE-16213.branch-1.v4.patch, HBASE-16213.patch, HBASE-16213_branch1_v3.patch, HBASE-16213_v2.patch,
>                      hfile-cpu.png, hfile_block_performance.pptx, hfile_block_performance2.pptx, hfile_block_performance_E2E.pptx
> HFileBlock stores cells sequentially. Currently, to get a row from a block, it scans
> from the first cell until it reaches the row's cell.
> The new structure stores every row's start offset along with the data, so it can find the exact
> row with a binary search.
> I used EncodedSeekPerformanceTest to test the performance.
> First, I used YCSB to write 1 million (100w) rows; every row has only one qualifier, and valueLength = 16B/64B/256B/1KB.
> Then I used EncodedSeekPerformanceTest to test random reads of 10 thousand (1w) or 1 million (100w) rows, and also recorded
> HFileBlock's dataSize/dataWithMetaSize in the encoding.
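
For illustration, the row-offset idea in the quoted description can be sketched roughly as below. This is a minimal sketch and is not taken from the HBASE-16213 patches: the class RowOffsetBlockSketch, its methods, and the row layout ([keyLen][key][valueLen][value]) are all hypothetical, and each row holds a single value rather than real HBase cells. It only contrasts a sequential scan with a binary search over stored row start offsets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not HBase code: rows are stored back to back in one
// buffer, and a side array records where each row starts.
final class RowOffsetBlockSketch {
    // Assumed row layout for this sketch: [keyLen:int][key][valueLen:int][value]
    private final ByteBuffer data;
    private final int[] rowOffsets; // start offset of every row inside data

    private RowOffsetBlockSketch(ByteBuffer data, int[] rowOffsets) {
        this.data = data;
        this.rowOffsets = rowOffsets;
    }

    // Builds a block; keys must already be sorted in unsigned byte order.
    static RowOffsetBlockSketch build(String[] sortedKeys, String[] values) {
        int n = sortedKeys.length;
        byte[][] k = new byte[n][];
        byte[][] v = new byte[n][];
        int size = 0;
        for (int i = 0; i < n; i++) {
            k[i] = sortedKeys[i].getBytes(StandardCharsets.UTF_8);
            v[i] = values[i].getBytes(StandardCharsets.UTF_8);
            size += 8 + k[i].length + v[i].length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        int[] offsets = new int[n];
        for (int i = 0; i < n; i++) {
            offsets[i] = buf.position(); // remember where this row starts
            buf.putInt(k[i].length).put(k[i]).putInt(v[i].length).put(v[i]);
        }
        return new RowOffsetBlockSketch(buf, offsets);
    }

    // Reads a length-prefixed field using absolute gets (no position state).
    private byte[] readField(int offset) {
        int len = data.getInt(offset);
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            out[i] = data.get(offset + 4 + i);
        }
        return out;
    }

    // Lexicographic comparison treating bytes as unsigned.
    private static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Sequential lookup: walk rows from the first one until the key is reached.
    String getByScan(String key) {
        byte[] target = key.getBytes(StandardCharsets.UTF_8);
        int offset = 0;
        while (offset < data.capacity()) {
            byte[] k = readField(offset);
            byte[] v = readField(offset + 4 + k.length);
            int cmp = compare(k, target);
            if (cmp == 0) return new String(v, StandardCharsets.UTF_8);
            if (cmp > 0) return null; // went past where the key would be
            offset += 8 + k.length + v.length;
        }
        return null;
    }

    // Indexed lookup: binary search over the stored row start offsets.
    String getByBinarySearch(String key) {
        byte[] target = key.getBytes(StandardCharsets.UTF_8);
        int lo = 0;
        int hi = rowOffsets.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            byte[] k = readField(rowOffsets[mid]);
            int cmp = compare(k, target);
            if (cmp == 0) {
                byte[] v = readField(rowOffsets[mid] + 4 + k.length);
                return new String(v, StandardCharsets.UTF_8);
            }
            if (cmp < 0) lo = mid + 1; else hi = mid - 1;
        }
        return null;
    }
}
```

In this sketch the scan compares keys row by row, while the binary search needs only a logarithmic number of key comparisons, at the cost of one extra int per row for the offset array. That extra metadata is presumably why the description records dataSize and dataWithMetaSize separately.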

