hbase-issues mailing list archives

From "binlijin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-16213) A new HFileBlock structure for fast random get
Date Tue, 19 Jul 2016 13:44:20 GMT

     [ https://issues.apache.org/jira/browse/HBASE-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

binlijin updated HBASE-16213:
-----------------------------
    Description: 
HFileBlock stores cells sequentially. Currently, to get a row from a block, the reader scans
from the first cell until it reaches the row's cell.
The new structure stores every row's start offset alongside the data, so the exact row can be
found with a binary search.

I used EncodedSeekPerformanceTest to test the performance.
First, I used YCSB to write 1 million rows, each with a single qualifier, with value lengths of
16 B, 64 B, 256 B, and 1 KB.
Then I used EncodedSeekPerformanceTest to test random reads of 10,000 or 1 million rows.

  was:
HFileBlock stores cells sequentially. Currently, to get a row from a block, the reader scans
from the first cell until it reaches the row's cell.
The new structure stores every row's start offset alongside the data, so the exact row can be
found with a binary search.



> A new HFileBlock structure for fast random get
> ----------------------------------------------
>
>                 Key: HBASE-16213
>                 URL: https://issues.apache.org/jira/browse/HBASE-16213
>             Project: HBase
>          Issue Type: New Feature
>          Components: Performance
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-16213-master_v1.patch, HBASE-16213.patch, HBASE-16213_branch1_v3.patch, HBASE-16213_v2.patch, new-hfile-block.xlsx
>
>
> HFileBlock stores cells sequentially. Currently, to get a row from a block, the reader scans
> from the first cell until it reaches the row's cell.
> The new structure stores every row's start offset alongside the data, so the exact row can be
> found with a binary search.
> I used EncodedSeekPerformanceTest to test the performance.
> First, I used YCSB to write 1 million rows, each with a single qualifier, with value lengths of
> 16 B, 64 B, 256 B, and 1 KB.
> Then I used EncodedSeekPerformanceTest to test random reads of 10,000 or 1 million rows.
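
The idea in the description (keep each row's start offset next to the sequentially stored data, then binary-search those offsets) can be illustrated with a small sketch. This is not the layout used in the attached patches: the class name RowIndexedBlock, the [keyLen][key][valueLen][value] row encoding, and the methods build and get are all hypothetical and chosen only to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the HFileBlock format from the patch.
public class RowIndexedBlock {
    // Rows stored back to back, each as [int keyLen][key][int valueLen][value].
    private final ByteBuffer data;
    // Start offset of every row inside data, in row order.
    private final int[] rowOffsets;

    private RowIndexedBlock(byte[] bytes, int[] rowOffsets) {
        this.data = ByteBuffer.wrap(bytes);
        this.rowOffsets = rowOffsets;
    }

    /** Builds a block from {key, value} pairs already sorted by key (unsigned byte order). */
    public static RowIndexedBlock build(String[][] sortedRows) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        int[] offsets = new int[sortedRows.length];
        try {
            for (int i = 0; i < sortedRows.length; i++) {
                offsets[i] = out.size(); // remember where this row starts
                byte[] key = sortedRows[i][0].getBytes(StandardCharsets.UTF_8);
                byte[] value = sortedRows[i][1].getBytes(StandardCharsets.UTF_8);
                out.writeInt(key.length);
                out.write(key);
                out.writeInt(value.length);
                out.write(value);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e); // in-memory stream, not expected
        }
        return new RowIndexedBlock(bytes.toByteArray(), offsets);
    }

    /** Returns the value stored for rowKey, or null if the row is absent. */
    public String get(String rowKey) {
        byte[] target = rowKey.getBytes(StandardCharsets.UTF_8);
        int low = 0;
        int high = rowOffsets.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = compareKeyAt(rowOffsets[mid], target);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return readValueAt(rowOffsets[mid]);
            }
        }
        return null;
    }

    // Compares the key stored at offset with target, byte by byte, as unsigned values.
    private int compareKeyAt(int offset, byte[] target) {
        int keyLen = data.getInt(offset);
        int n = Math.min(keyLen, target.length);
        for (int i = 0; i < n; i++) {
            int a = data.get(offset + 4 + i) & 0xff;
            int b = target[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return keyLen - target.length;
    }

    private String readValueAt(int offset) {
        int keyLen = data.getInt(offset);
        int valueStart = offset + 4 + keyLen;
        int valueLen = data.getInt(valueStart);
        return new String(data.array(), valueStart + 4, valueLen, StandardCharsets.UTF_8);
    }
}
```

With n rows in a block, a lookup in this sketch touches about log2(n) keys instead of walking the cells from the start of the block, at the cost of one extra int per row for the offset.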



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
