hbase-issues mailing list archives

From "Jerry Lam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
Date Tue, 11 Sep 2012 15:29:08 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453104#comment-13453104 ]

Jerry Lam commented on HBASE-6757:

Hi Lars:

Thanks for looking into it!
I have one small concern, not about the change itself but about existing code that depends on
the SKIP return code. Some users may have a complex FilterList (a FilterList of a FilterList
of a FilterList, etc.) that depends on the SKIP behaviour, and they might not be aware
of it. What do you think?
> Very inefficient behaviour of scan using FilterList
> ---------------------------------------------------
>                 Key: HBASE-6757
>                 URL: https://issues.apache.org/jira/browse/HBASE-6757
>             Project: HBase
>          Issue Type: Improvement
>          Components: filters
>    Affects Versions: 0.90.6
>            Reporter: Jerry Lam
>         Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
> The behaviour of scan is very inefficient when used with a FilterList.
> The FilterList rewrites the return code from a filter from NEXT_ROW to SKIP if
> Operator.MUST_PASS_ALL is used.
> This happens when using ColumnPrefixFilter. Even though the ColumnPrefixFilter indicates
> to jump to NEXT_ROW because no further match can be found, the scan continues to scan all
> versions of a column in that row and all columns of that row, because the ReturnCode from
> ColumnPrefixFilter has been rewritten by the FilterList from NEXT_ROW to SKIP.
> This is particularly inefficient when there are many versions in a column, because the
> check is performed on every version of the column instead of only on the qualifier
> of the column.
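
To make the cost described above concrete, here is a minimal, self-contained sketch in plain Java. It does not use the HBase API: the ReturnCode enum, Cell class, CellFilter interface, and the prefixFilter, rewritingWrapper, and countFilterCalls helpers are all hypothetical stand-ins invented for illustration. It only models the reported effect, assuming a scanner that abandons the rest of a row on NEXT_ROW and keeps calling the filter on every cell version after SKIP.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; none of these types are the real HBase classes.
class FilterListSkipSketch {
    // Stand-in for the return codes named in the issue.
    enum ReturnCode { INCLUDE, SKIP, NEXT_ROW }

    // One version of one column in one row.
    static final class Cell {
        final String row;
        final String qualifier;
        final long timestamp;
        Cell(String row, String qualifier, long timestamp) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    interface CellFilter {
        ReturnCode filterCell(Cell cell);
    }

    // Hypothetical prefix filter: qualifiers arrive sorted, so a qualifier that
    // sorts after the prefix means nothing later in the row can match.
    static CellFilter prefixFilter(String prefix) {
        return cell -> {
            if (cell.qualifier.startsWith(prefix)) {
                return ReturnCode.INCLUDE;
            }
            return cell.qualifier.compareTo(prefix) < 0 ? ReturnCode.SKIP : ReturnCode.NEXT_ROW;
        };
    }

    // Models the rewrite described in the issue: NEXT_ROW is downgraded to SKIP.
    static CellFilter rewritingWrapper(CellFilter inner) {
        return cell -> {
            ReturnCode rc = inner.filterCell(cell);
            return rc == ReturnCode.NEXT_ROW ? ReturnCode.SKIP : rc;
        };
    }

    // Counts how many times the filter is invoked over cells sorted by row,
    // then qualifier, then version. NEXT_ROW abandons the rest of that row.
    static int countFilterCalls(List<Cell> sortedCells, CellFilter filter) {
        int calls = 0;
        String abandonedRow = null;
        for (Cell cell : sortedCells) {
            if (cell.row.equals(abandonedRow)) {
                continue; // the scanner already jumped past this row
            }
            calls++;
            if (filter.filterCell(cell) == ReturnCode.NEXT_ROW) {
                abandonedRow = cell.row;
            }
        }
        return calls;
    }

    // Builds rows x qualifiers x versions cells in sorted order, newest version first.
    static List<Cell> buildCells(int rows, String[] qualifiers, int versions) {
        List<Cell> cells = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (String q : qualifiers) {
                for (int v = versions; v >= 1; v--) {
                    cells.add(new Cell("row" + r, q, v));
                }
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        List<Cell> cells = buildCells(2, new String[] {"a1", "a2", "b1", "c1", "d1"}, 100);
        CellFilter prefix = prefixFilter("a");
        System.out.println("filter used directly:   " + countFilterCalls(cells, prefix));
        System.out.println("NEXT_ROW rewritten:     " + countFilterCalls(cells, rewritingWrapper(prefix)));
    }
}
```

With two rows, five columns, and 100 versions per column, the direct filter is called 402 times (it stops at the first version of "b1" in each row), while the wrapped filter is called on all 1000 cells. The numbers are specific to this toy data and say nothing about real scan timings.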

