hbase-issues mailing list archives

From "jsichi (John Sichi) (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter
Date Tue, 18 Oct 2011 23:41:13 GMT

     [ https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jsichi (John Sichi) updated HBASE-4532:
---------------------------------------

    Attachment: D27.1.patch

Liyin requested code review of "[jira] [HBASE-4532] Avoid top row seek by dedicated bloom
filter for delete family bloom filter".
Reviewers: DUMMY_REVIEWER

  hbase 4532 89

  HBASE-4469 ("Avoid top row seek by looking up bloomfilter",
https://issues.apache.org/jira/browse/HBASE-4469) avoids the top row
seek operation if row-col bloom filter is enabled.
  This JIRA tries to avoid the top row seek in all cases by creating a dedicated bloom filter
used only for delete family markers.

  The only subtle case is when we are interested in the top row with an empty column.

  For example,
  suppose we are interested in row1/cf1:/1/put.
  We seek to the top row: row1/cf1:/MAX_TS/MAXIMUM, and the delete family bloom filter
reports that there is NO delete family.
  The scanner then skips the top row seek and returns a fake kv, which is the last possible kv
for this row and column (createLastOnRowCol).
  Because the column is empty, that fake kv sorts after row1/cf1:/1/put, so we have already
missed the real kv we are interested in.

  The solution is to disable this optimization whenever we GET/SCAN a row with an empty
column.
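
  The empty-column case above can be sketched as a small, self-contained Java model. This is
not the code in D27.1.patch: the class DeleteFamilySeekSketch, its Key type, and the
guardEmptyQualifier flag are hypothetical names chosen for illustration, a HashSet stands in
for the delete family bloom filter (so it has no false positives), and the key ordering is
reduced to row, qualifier, and newest timestamp first.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the top row seek optimization; all names here are hypothetical. */
final class DeleteFamilySeekSketch {

    /** Simplified cell key: ordered by row, then qualifier, then newest timestamp first. */
    static final class Key implements Comparable<Key> {
        final String row;
        final String qualifier;
        final long ts;

        Key(String row, String qualifier, long ts) {
            this.row = row;
            this.qualifier = qualifier;
            this.ts = ts;
        }

        @Override
        public int compareTo(Key o) {
            int c = row.compareTo(o.row);
            if (c != 0) return c;
            c = qualifier.compareTo(o.qualifier);
            if (c != 0) return c;
            return Long.compare(o.ts, ts); // larger timestamps sort first
        }
    }

    private final TreeSet<Key> cells = new TreeSet<>();
    // Stand-in for the delete family bloom filter: rows that have a delete family marker.
    private final Set<String> rowsWithDeleteFamily = new HashSet<>();

    void put(String row, String qualifier, long ts) {
        cells.add(new Key(row, qualifier, ts));
    }

    void markDeleteFamily(String row) {
        rowsWithDeleteFamily.add(row);
    }

    /** A key that sorts after every cell of (row, qualifier), like the fake kv. */
    static Key lastOnRowCol(String row, String qualifier) {
        return new Key(row, qualifier, Long.MIN_VALUE);
    }

    /**
     * Returns the first stored cell the scanner would reach when looking up
     * (row, qualifier). With guardEmptyQualifier set, the optimization is
     * disabled for empty qualifiers, as the description proposes.
     */
    Key seek(String row, String qualifier, boolean guardEmptyQualifier) {
        Key topOfRow = new Key(row, "", Long.MAX_VALUE);
        boolean mayHaveDeleteFamily = rowsWithDeleteFamily.contains(row);
        boolean skipTopRowSeek = !mayHaveDeleteFamily
                && !(guardEmptyQualifier && qualifier.isEmpty());
        // Skipping the top row seek means starting from the fake "last on row/col" key.
        Key start = skipTopRowSeek ? lastOnRowCol(row, "") : topOfRow;
        return cells.ceiling(start);
    }

    public static void main(String[] args) {
        DeleteFamilySeekSketch store = new DeleteFamilySeekSketch();
        store.put("row1", "", 1L);    // the cell we want: row1/cf1:/1/put
        store.put("row1", "q1", 5L);  // a later column in the same row

        Key unguarded = store.seek("row1", "", false);
        Key guarded = store.seek("row1", "", true);
        System.out.println("unguarded lands on qualifier '" + unguarded.qualifier + "'");
        System.out.println("guarded lands on qualifier '" + guarded.qualifier + "'");
    }
}
```

  In this model the unguarded lookup starts from the fake key and lands on the next column,
while the guarded lookup performs the top row seek and reaches the empty-column cell.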

TEST PLAN
  EMPTY

REVISION DETAIL
  http://reviews.facebook.net/D27

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/KeyValue.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
  src/main/java/org/apache/hadoop/hbase/util/BloomFilterFactory.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

> Avoid top row seek by dedicated bloom filter for delete family bloom filter
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-4532
>                 URL: https://issues.apache.org/jira/browse/HBASE-4532
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>         Attachments: D27.1.patch, D27.1.patch
>
>
> HBASE-4469 avoids the top row seek operation if row-col bloom filter is enabled.
> This JIRA tries to avoid the top row seek in all cases by creating a dedicated bloom
> filter used only for delete family markers.
> The only subtle case is when we are interested in the top row with an empty column.
> For example,
> suppose we are interested in row1/cf1:/1/put.
> We seek to the top row: row1/cf1:/MAX_TS/MAXIMUM, and the delete family bloom filter
> reports that there is NO delete family.
> The scanner then skips the top row seek and returns a fake kv, which is the last possible
> kv for this row and column (createLastOnRowCol).
> Because the column is empty, that fake kv sorts after row1/cf1:/1/put, so we have
> already missed the real kv we are interested in.
> The solution is to disable this optimization whenever we GET/SCAN a row with an empty
> column.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
