hbase-issues mailing list archives

From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter
Date Wed, 12 Oct 2011 21:29:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126172#comment-13126172 ]

jiraposter@reviews.apache.org commented on HBASE-4469:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2235/#review2541
-----------------------------------------------------------

Ship it!


Patch looks good.  Small.  Only works if bloom filters are already on?

- Michael


On 2011-10-06 17:17:23, Liyin wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2235/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-06 17:17:23)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  When seeking to a row/column in an HFile, we currently go to the top of the row to check for a row delete marker (delete family).
bq.  However, if the Bloom filter is enabled for the column family, a delete family operation on a row already adds that row to the Bloom filter.
bq.  We can take advantage of this fact to avoid seeking to the top of the row: if the row is not in the Bloom filter, the file holds no delete family marker for it.
bq.  
bq.  Also updated the TestBlocksRead unit tests, since most of the block read counts have dropped.
bq.  
bq.  Evaluation:
bq.  In TestSeekingOptimization, the optimization previously saved 31.6% of seek operations.
bq.  With this diff it saves about 41.82% of seek operations,
bq.  i.e. roughly 10 percentage points more.
bq.  
bq.  ======================
bq.  Before this diff:
bq.  For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with optimization: 1714 (68.40%), savings: 31.60%
bq.  
bq.  =====================
bq.  Apply this diff:
bq.  For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with optimization: 1458 (58.18%), savings: 41.82%
bq.  =====================
bq.  
bq.  Thanks to Mikhail and Kannan for their help and discussion.
bq.  
bq.  
bq.  This addresses bug HBASE-4469.
bq.      https://issues.apache.org/jira/browse/HBASE-4469
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 7b0b9e6 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 8dd8a68 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java abccea4 
bq.  
bq.  Diff: https://reviews.apache.org/r/2235/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Ran all the unit tests.
bq.  Two unit tests fail both with and without my change:
bq.  TestDistributedLogSplitting
bq.  TestHTablePool
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Liyin
bq.  
bq.
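
The savings figures quoted in the Evaluation section follow directly from the raw seek counts. As a quick check, the sketch below recomputes them; the class and method names (`SeekSavings`, `savingsPercent`) are made up for illustration and are not part of the patch or of TestSeekingOptimization.

```java
/** Illustration only: recomputes the savings percentages from the quoted seek counts. */
public class SeekSavings {

    /** Percentage of seeks avoided, given seek counts without and with the optimization. */
    static double savingsPercent(int seeksWithoutOpt, int seeksWithOpt) {
        return 100.0 * (seeksWithoutOpt - seeksWithOpt) / seeksWithoutOpt;
    }

    public static void main(String[] args) {
        // Counts taken from the review request (bloom=ROWCOL, compr=GZ).
        double before = savingsPercent(2506, 1714);
        double after = savingsPercent(2506, 1458);
        System.out.printf("before this diff: %.2f%%%n", before);
        System.out.printf("after this diff:  %.2f%%%n", after);
        System.out.printf("improvement:      %.2f percentage points%n", after - before);
    }
}
```

The improvement is a difference of about 10 percentage points in the share of seeks avoided (1714 seeks down to 1458 out of 2506), not a 10% relative change.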


                
> Avoid top row seek by looking up bloomfilter
> --------------------------------------------
>
>                 Key: HBASE-4469
>                 URL: https://issues.apache.org/jira/browse/HBASE-4469
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>
> When seeking to a row/column in an HFile, we currently go to the top of the row to check
> for a row delete marker (delete family). However, if the Bloom filter is enabled for the
> column family, a delete family operation on a row already adds that row to the Bloom filter.
> We can take advantage of this fact to avoid seeking to the top of the row: if the row is
> not in the Bloom filter, the file holds no delete family marker for it.
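
The idea in the description can be illustrated with a toy model. The following is a minimal sketch, not the actual patch: `TopRowSeekSketch`, `ToyBloom`, and `needsTopRowSeek` are hypothetical names, and the filter is a simplified stand-in for HBase's real Bloom filter implementation. It assumes only what the description states, namely that writing a delete family marker adds the row to the Bloom filter, so a negative lookup means no such marker exists and the seek to the top of the row can be skipped.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Toy illustration only; none of these names are HBase classes. */
public class TopRowSeekSketch {

    /** Minimal Bloom filter: each key sets a few bit positions derived by double hashing. */
    static final class ToyBloom {
        private final BitSet bits;
        private final int size;
        private final int hashes;

        ToyBloom(int size, int hashes) {
            this.bits = new BitSet(size);
            this.size = size;
            this.hashes = hashes;
        }

        private int position(byte[] key, int i) {
            int h1 = Arrays.hashCode(key);
            int h2 = (h1 >>> 16) | 1; // force an odd second hash
            return Math.floorMod(h1 + i * h2, size);
        }

        void add(String rowKey) {
            byte[] key = rowKey.getBytes(StandardCharsets.UTF_8);
            for (int i = 0; i < hashes; i++) {
                bits.set(position(key, i));
            }
        }

        /** False means "definitely absent"; true means "possibly present". */
        boolean mightContain(String rowKey) {
            byte[] key = rowKey.getBytes(StandardCharsets.UTF_8);
            for (int i = 0; i < hashes; i++) {
                if (!bits.get(position(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** The reader only has to seek to the top of the row if the filter may contain it. */
    static boolean needsTopRowSeek(ToyBloom bloom, String rowKey) {
        return bloom.mightContain(rowKey);
    }

    public static void main(String[] args) {
        ToyBloom bloom = new ToyBloom(1024, 3);
        // Nothing has been added yet, so the top-of-row seek can be skipped.
        System.out.println(needsTopRowSeek(bloom, "row-a"));
        // Simulate a delete family operation on row-a: the row goes into the filter.
        bloom.add("row-a");
        System.out.println(needsTopRowSeek(bloom, "row-a"));
    }
}
```

Because a Bloom filter can return false positives but never false negatives, skipping the seek on a negative answer is safe. A positive answer only means the seek is still required, which is the behavior before this change.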

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
