hbase-dev mailing list archives

From Jonathan Gray <jl...@streamy.com>
Subject Re: real prefix filter
Date Thu, 20 Aug 2009 19:42:41 GMT
Thanks Ryan.

Do note, this will not give you proper behavior on a Get, only a Scan. 
I don't just mean the prefix+while, I also mean using the 
WhileMatchFilter at all on a Get.

Filters are allowed in Gets because you can filter on columns and values
as well; on a Get they don't make sense on row keys.  If you need an
"early-out" filter with a Get, you most likely need to use a Scan instead.

This kind of confusion/inconsistency is more reason to re-implement Gets 
as optimized Scans...
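
For the start/stop-row early-out mentioned in the quoted message below, here is a minimal sketch of how a stop row could be derived from a prefix. PrefixRange and stopRowForPrefix are hypothetical names used for illustration, not part of the HBase API; the sketch assumes row keys sort as unsigned bytes, that the stop row is exclusive, and that an empty stop row means "scan to the end of the table".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper (not part of HBase): derives an exclusive stop row
// for a scan that should cover only rows starting with a given prefix.
class PrefixRange {

    // Returns the smallest key that sorts after every key starting with
    // the prefix, or an empty array when no such key exists.
    static byte[] stopRowForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so skip back past them.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            // Prefix is empty or all 0xFF: assume "no stop row".
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last byte that is not 0xFF
        return stop;
    }

    public static void main(String[] args) {
        byte[] prefix = "user123_".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(prefix);
        // With the HBase client, the pair would be passed as
        //   new Scan(prefix, stop)
        // so the scan ends at the stop row without consulting any filter.
        System.out.println(new String(stop, StandardCharsets.UTF_8));
    }
}
```

Because the scan's own row range bounds it, this approach does not depend on filterAllRemaining, and the prefix itself doubles as the start row.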


Ryan Rawson wrote:
> The expected idiom is like so:
>       scanSpec.setFilter(new WhileMatchFilter(
>           new PrefixFilter(prefix)));
> This is common for most filters: rather than encoding the 'stop once
> past' type of logic in each filter, it is embedded in the WhileMatchFilter,
> and all others are wrapped with it where necessary.
> -ryan
> 2009/8/20 Jonathan Gray <jlist@streamy.com>:
>> It should, perhaps, stop once you pass the prefix.  I actually thought it
>> did, but you and the code say otherwise.  Doing the early-out with a Get is
>> actually not possible, so this may be why it is not implemented as such.
>> However, a Scan can take both a startRow and a stopRow.  So you can use that
>> to early-out instead.
>> Given that filters now work with Gets, you cannot actually implement the
>> early-out within the filter.  You'll have to use start/stop rows.  One could
>> argue a prefix filter may not make much sense on a Get (since you must
>> explicitly specify row), so if you'd like to raise that issue and see if we
>> could integrate an early-out in the filter, please file a JIRA.
>> JG
>> Matus Zamborsky wrote:
>>> Hello,
>>> I am scanning an HBase table with Scan and I am using PrefixFilter. As I
>>> understand, it scans the whole table and runs the filter on every row. But
>>> why does it not stop after finding a row without the desired prefix? If it
>>> did not find the prefix, it should return true from filterAllRemaining.
>>> Combining this with specifying the start row in the Scan object, one
>>> could very quickly filter only rows with the desired prefix.
>>> I am using hbase 0.20 from trunk.
>>> Regards
>>> Matus Zamborsky
