hbase-issues mailing list archives

From "Nicolas Spiegelberg (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2794) ROWCOL bloom filter not used if multiple columns within same family are requested in a Get
Date Wed, 14 Jul 2010 01:15:51 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888142#action_12888142 ]

Nicolas Spiegelberg commented on HBASE-2794:
--------------------------------------------

Talked with Kris about setting proper exit conditions.

#1: Exit if our error rate is > 10%.  This is an arbitrary number; it could easily be made
configurable if someone needs it.
#2: Exit if it would take > 1ms to run the bloom check.  This ensures that blooms are
beneficial for performance even if they aren't needed 90% of the time.

I wonder if it would be good to give the user an option of not running a bloom check if there
is only 1 HFile in the StoreFile, but that's for another JIRA.
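
A minimal sketch of how the two exit conditions above might be expressed. It assumes that
"error rate" means the chance that at least one of the per-column bloom lookups is a false
positive, and that the cost of the check scales linearly with the number of columns. Both
are interpretations, not something the comment states. The class, method, and parameter
names are hypothetical and are not HBase API; only the 10% and 1ms thresholds come from
the comment.

```java
public final class BloomExitConditions {
    // Thresholds quoted in the comment; both are described there as arbitrary.
    static final double MAX_ERROR_RATE = 0.10;       // condition #1: 10%
    static final long MAX_CHECK_NANOS = 1_000_000L;  // condition #2: 1 ms

    /**
     * Assumed model: with independent lookups that each have false-positive
     * rate p, the chance that at least one of n lookups is a false positive
     * is 1 - (1 - p)^n.
     */
    static double compoundErrorRate(double perKeyErrorRate, int numColumns) {
        return 1.0 - Math.pow(1.0 - perKeyErrorRate, numColumns);
    }

    /** Returns false when either exit condition says to skip the bloom check. */
    static boolean shouldCheckBloom(double perKeyErrorRate, int numColumns,
                                    long nanosPerLookup) {
        if (compoundErrorRate(perKeyErrorRate, numColumns) > MAX_ERROR_RATE) {
            return false; // condition #1: the answer would be too unreliable
        }
        if ((long) numColumns * nanosPerLookup > MAX_CHECK_NANOS) {
            return false; // condition #2: the check itself would cost too much
        }
        return true;
    }
}
```

Under these assumptions, a Get for 5 columns against a bloom with a 1% per-key error rate
would still use the bloom, while a Get for 20 columns would not, since 1 - 0.99^20 is
roughly 18%.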

> ROWCOL bloom filter not used if multiple columns within same family are requested in a Get
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2794
>                 URL: https://issues.apache.org/jira/browse/HBASE-2794
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>
> Noticed the following snippet in StoreFile.java:Scanner:shouldSeek():
> {code}
>         switch(bloomFilterType) {
>           case ROW:
>             key = row;
>             break;
>           case ROWCOL:
>             if (columns.size() == 1) {
>               byte[] col = columns.first();
>               key = Bytes.add(row, col);
>               break;
>             }
>             //$FALL-THROUGH$
>           default:
>             return true;
>         }
> {code}
> If columns.size() > 1, then we currently don't take advantage of the bloom filter.
> We should optimize this to check the bloom for each of the columns and, if none of the
> columns are present in the bloom, avoid opening the file.
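
The optimization proposed in the issue description could be sketched roughly as follows.
This is a standalone illustration, not the StoreFile.java code or the eventual patch: the
class name, the `concat` helper (standing in for `Bytes.add(row, col)`), and the
`bloomMightContain` predicate (standing in for the real bloom filter lookup) are all
hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public final class RowColBloomSketch {
    /** Concatenates row and column, playing the role of Bytes.add(row, col). */
    static byte[] concat(byte[] row, byte[] col) {
        byte[] key = Arrays.copyOf(row, row.length + col.length);
        System.arraycopy(col, 0, key, row.length, col.length);
        return key;
    }

    /**
     * Returns true if the file may hold at least one requested row+column key,
     * and false only when the bloom rules out every requested column.
     */
    static boolean shouldSeek(byte[] row, List<byte[]> columns,
                              Predicate<byte[]> bloomMightContain) {
        if (columns.isEmpty()) {
            return true; // no explicit columns: the bloom cannot rule the file out
        }
        for (byte[] col : columns) {
            if (bloomMightContain.test(concat(row, col))) {
                return true; // one possible hit is enough to open the file
            }
        }
        return false; // every column is definitely absent, so skip the file
    }
}
```

The loop returns on the first possible hit, so the worst case of one lookup per column
occurs only when the file can be skipped entirely.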

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

