hbase-user mailing list archives

From Amit Sela <am...@infolinks.com>
Subject Re: FuzzyRowFilter missing keys
Date Thu, 13 Mar 2014 08:48:09 GMT
On the same tables, I get missing row keys when the mask is in the prefix. If I
mask the second part of the key instead, like this:
201401\x00\x00\x00\x00\x00_product1___
with fuzzy info:
{0,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0}
it seems to work.

Has anyone encountered issues with masking the prefix? It seems odd, because the
Sematext example for using FuzzyRowFilter talks about masking the prefix...
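
To make the two masks in this thread easier to compare, here is a minimal, self-contained sketch of the matching rule they are expected to follow: a 1 in the fuzzy info means "any byte at this position", a 0 means "must equal the template byte". This is not HBase's implementation. The class and method names (FuzzyMaskSketch, templateKey, fuzzyInfo, matches) are hypothetical, and treating rows shorter than the template as non-matching is a simplifying assumption of the sketch. In HBase itself, the key and fuzzy info arrays would be passed as a pair to the FuzzyRowFilter constructor.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: models the expected fuzzy-match rule only.
class FuzzyMaskSketch {

    // Template key: fixed prefix, then fuzzyCount zero bytes (\x00), then fixed suffix.
    static byte[] templateKey(String prefix, int fuzzyCount, String suffix) {
        byte[] p = prefix.getBytes(StandardCharsets.US_ASCII);
        byte[] s = suffix.getBytes(StandardCharsets.US_ASCII);
        byte[] key = new byte[p.length + fuzzyCount + s.length]; // zero-filled by default
        System.arraycopy(p, 0, key, 0, p.length);
        System.arraycopy(s, 0, key, p.length + fuzzyCount, s.length);
        return key;
    }

    // Fuzzy info of the same length: 1 at fuzzy positions, 0 at fixed positions.
    static byte[] fuzzyInfo(int prefixLen, int fuzzyCount, int suffixLen) {
        byte[] info = new byte[prefixLen + fuzzyCount + suffixLen];
        Arrays.fill(info, prefixLen, prefixLen + fuzzyCount, (byte) 1);
        return info;
    }

    // A row matches when every fixed (0) position equals the template byte.
    // Assumption of this sketch: rows shorter than the template never match.
    static boolean matches(String row, byte[] key, byte[] info) {
        byte[] r = row.getBytes(StandardCharsets.US_ASCII);
        if (key.length != info.length || r.length < key.length) {
            return false;
        }
        for (int i = 0; i < key.length; i++) {
            if (info[i] == 0 && r[i] != key[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Mask in the prefix: 8 fuzzy date bytes, then the fixed "_US_product1___".
        byte[] prefixKey = templateKey("", 8, "_US_product1___");
        byte[] prefixInfo = fuzzyInfo(0, 8, 15);
        // Mask in the middle: fixed "201401", 5 fuzzy bytes, fixed "_product1___".
        byte[] middleKey = templateKey("201401", 5, "_product1___");
        byte[] middleInfo = fuzzyInfo(6, 5, 12);

        String[] rows = {"20140101_US_product1___", "20140102__product1_bla__", "20140103_____"};
        for (String row : rows) {
            System.out.println(row
                + " prefix-mask=" + matches(row, prefixKey, prefixInfo)
                + " middle-mask=" + matches(row, middleKey, middleInfo));
        }
    }
}
```

Under this rule, every key of the form yyyyMMdd_US_product1___ should pass the prefix mask regardless of the day, so rows missing from the real scan would point at something other than the key and fuzzy info themselves.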



On Tue, Mar 11, 2014 at 11:07 AM, Amit Sela <amits@infolinks.com> wrote:

> I can't seem to reproduce it in a unit test.
> The main difference is that I'm using bulk load in the cluster and the Put API
> in the unit test.
>
>
>
> On Mon, Mar 10, 2014 at 4:47 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Amit:
>> Can you put your scenario in a unit test so that it is easier to pinpoint
>> where the issue is ?
>>
>> Thanks
>>
>>
>> On Mon, Mar 10, 2014 at 5:25 AM, Amit Sela <amits@infolinks.com> wrote:
>>
>> > My table contains keys of this kind over an entire month but the scan
>> > returns only some of the days.
>> > I have 20140101-20140131 but the scan returns only:
>> > 20140104, 20140110, 20140111, 20140118, 20140120, 20140125, 20140128
>> >
>> > Using get or scan with no fuzzy filter works...
>> >
>> >
>> > On Mon, Mar 10, 2014 at 1:59 PM, Bharath Vissapragada <bharathv@cloudera.com> wrote:
>> >
>> > > Is it because you fixed "_US_product1___" part of the key?  From your
>> > > definition of filter you should get as output all keys of form
>> > > "yyyyMMdd_US_product1___".
>> > > Can you share a key that's of this format and missing in the output?
>> > >
>> > >
>> > > On Mon, Mar 10, 2014 at 3:38 PM, Amit Sela <amits@infolinks.com> wrote:
>> > >
>> > > > Hi all,
>> > > > I'm using HBase 0.94.12 + Hadoop 1.0.4.
>> > > > I'm trying to use FuzzyRowFilter, and it looks like it's missing keys in
>> > > > the scan.
>> > > >
>> > > > Row key structure:
>> > > > yyyyMMdd_Country_Product_Category1_Category2_
>> > > > Where the date is mandatory and all other fields may be "".
>> > > > Examples:
>> > > > 20140101_US_product1___
>> > > > 20140102__product1_bla__
>> > > > 20140103_____
>> > > >
>> > > > Supplying the filter with row key:
>> > > > \x00\x00\x00\x00\x00\x00\x00\x00_US_product1___
>> > > > and fuzzy info:
>> > > > {1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
>> > > >
>> > > > Over a range of a month, although the key exists for every day in the
>> > > > month, I get results only for some of the days.
>> > > >
>> > > > I tried it on another table and the same happens. I'll mention that both
>> > > > tables have keys that start with yyyyMMdd.
>> > > >
>> > > > Has anyone had a similar issue before? I saw something in the mailing list
>> > > > archives but no results there...
>> > > >
>> > > > Thanks,
>> > > > Amit.
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Bharath Vissapragada
>> > > <http://www.cloudera.com>
>> > >
>> >
>>
>
>
