hbase-issues mailing list archives

From "Anil Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6509) Implement fast-forwarding FuzzyRowFilter to allow filtering rows e.g. by "???alex?b"
Date Fri, 17 Aug 2012 17:37:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436901#comment-13436901 ]

Anil Gupta commented on HBASE-6509:
-----------------------------------

Hi Alex,

I have a question related to this filter. I have a similar filtering requirement that would
be an extension of FuzzyRowFilter.
Suppose I have row keys with the structure userid_actionid, where userid is 6 digits
and actionid is 5 digits. I would like to get all the rows with an actionid between
00200 and 00350. With the current FuzzyRowFilter I can search for all the rows with a particular actionid.
Instead of searching for a particular actionid, I would like to search for a range of actionids.
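
To make the requirement concrete, here is a minimal sketch of the per-row check such a range filter would need. It is plain Java with no HBase dependency, not an actual Filter implementation. It assumes the "_" in userid_actionid is a literal one-byte separator and that both ids are zero-padded ASCII digits; the class and method names are hypothetical. In a custom filter, logic like this would presumably live in the row-key check.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: checks whether the actionid part of a
// row key of the assumed form <6-digit userid>_<5-digit actionid> lies in [lo, hi].
final class ActionRangeSketch {
    static final int USER_ID_LEN = 6;
    static final int ACTION_ID_LEN = 5;
    // Assumption: a literal one-byte '_' separates userid and actionid.
    static final int ACTION_OFFSET = USER_ID_LEN + 1;

    static boolean inActionRange(byte[] rowKey, byte[] lo, byte[] hi) {
        if (lo.length != ACTION_ID_LEN || hi.length != ACTION_ID_LEN) {
            throw new IllegalArgumentException("bounds must be " + ACTION_ID_LEN + " bytes");
        }
        if (rowKey.length < ACTION_OFFSET + ACTION_ID_LEN) {
            return false; // key too short to contain an actionid
        }
        return compareAt(rowKey, ACTION_OFFSET, lo) >= 0
            && compareAt(rowKey, ACTION_OFFSET, hi) <= 0;
    }

    // Unsigned byte-wise comparison; for fixed-width zero-padded digits this
    // agrees with numeric order.
    static int compareAt(byte[] key, int offset, byte[] bound) {
        for (int i = 0; i < bound.length; i++) {
            int diff = (key[offset + i] & 0xff) - (bound[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] lo = ascii("00200");
        byte[] hi = ascii("00350");
        System.out.println(inActionRange(ascii("123456_00275"), lo, hi)); // true
        System.out.println(inActionRange(ascii("123456_00400"), lo, hi)); // false
    }
}
```

A check like this on its own still visits every row. The benefit described for FuzzyRowFilter comes from additionally telling the scanner which key to seek to next, which this sketch does not attempt.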


Does this use case sound like an extension of the current FuzzyRowFilter? Can I run this kind
of filter on HBase 0.92 without any significant update to the cluster? If I develop this
kind of filter, what is needed to run it on all the region servers?

Thanks,
Anil 
                
> Implement fast-forwarding FuzzyRowFilter to allow filtering rows e.g. by "???alex?b"
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-6509
>                 URL: https://issues.apache.org/jira/browse/HBASE-6509
>             Project: HBase
>          Issue Type: New Feature
>          Components: filters
>            Reporter: Alex Baranau
>            Assignee: Alex Baranau
>            Priority: Minor
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6509_1.patch, HBASE-6509_2.patch, HBASE-6509_3.patch, HBASE-6509.patch
>
>
> Implement a fuzzy row key filter to allow fetching records by criteria such as "???alex?b".
> This is useful as an alternative way to select records by row key when the known part of the key
> is not a prefix. Because the filter fast-forwards, in many situations
> it helps to avoid heavy full-table scans.
> This is especially effective when you have a composite row key and some of its parts
> have fixed length. E.g. with a key of the format userId_actionId_time, given that the userId and
> actionId lengths are fixed, one can select user actions of a specific type using a fuzzy row key
> by specifying the mask "????_myaction". Given the fast-forwarding nature of the filter, this will usually
> work much faster than doing a whole-table scan with any of the existing server-side filters.
> In many cases this can work as an alternative to secondary indexing.
> Many times users implement it as a custom filter, and many times they just don't know this
> is possible. Let's add it to the common codebase.
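
The mask matching described in the quoted text can be sketched in plain Java as below. This is only an illustration of the matching rule, assuming "?" stands for "any single byte" and that the mask is anchored at the start of the row key. It is not the code from the attached patches, it does not implement fast-forwarding, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of fuzzy row key matching; not the HBASE-6509 patch.
final class FuzzyMaskSketch {
    // Returns true if every non-'?' byte of the mask equals the row key byte at
    // the same position. Bytes of the row key past the end of the mask are ignored.
    // Limitation of this sketch: a literal '?' byte in a key cannot be required.
    static boolean matches(byte[] rowKey, byte[] mask) {
        if (rowKey.length < mask.length) {
            return false;
        }
        for (int i = 0; i < mask.length; i++) {
            if (mask[i] != '?' && mask[i] != rowKey[i]) {
                return false;
            }
        }
        return true;
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] mask = ascii("????_myaction");
        System.out.println(matches(ascii("0001_myaction_1345"), mask)); // true
        System.out.println(matches(ascii("0001_another_1345"), mask));  // false
    }
}
```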

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira