hbase-user mailing list archives

From Liam Slusser <lslus...@gmail.com>
Subject Re: FuzzyRowFilter weird results
Date Fri, 25 Apr 2014 09:02:09 GMT
I've figured out my problem. In Python (or Jython, as this is) you don't
need to escape the backslash, so what is written as \\x00 in Java is just
\x00 in Jython/Python. Oops! Basically I was adding a whole bunch of extra
bytes for each \ that shouldn't have been there, causing the filter to never
match anything.
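
To make the difference concrete, here is a minimal sketch in plain Python
(not taken from the original code). It assumes 31 placeholder positions after
the leading 'e', as a 32-byte key would need; the names `escaped` and
`correct` are only for illustration.

```python
# Doubled backslashes: each "\\x00" is four characters
# (backslash, 'x', '0', '0'), not a single NUL byte.
escaped = "e" + "\\x00" * 31

# Single backslash: each "\x00" is one NUL character.
correct = "e" + "\x00" * 31

print(len(escaped))  # 125
print(len(correct))  # 32
```

With the doubled backslashes the fuzzy key is 125 characters long while the
mask has only 32 entries, so the two no longer line up position by position.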

thanks!
liam



On Thu, Apr 24, 2014 at 9:24 PM, Liam Slusser <lslusser@gmail.com> wrote:

> I'm running CDH4.6.0 with HBase 0.94.15-cdh4.6.0.  I was wondering, does
> the key need to be serialized?  Currently my keys are strings, not raw
> bytes.
>
> thanks,
> liam
>
>
> On Thu, Apr 24, 2014 at 6:28 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Which HBase version are you using ?
>>
>> Cheers
>>
>>
>> On Thu, Apr 24, 2014 at 6:24 PM, Liam Slusser <lslusser@gmail.com> wrote:
>>
>> > Hey All -
>> >
>> > I'm having some strange results using FuzzyRowFilter.  I'm programming in
>> > jython for that extra bit of adventure.
>> >
>> > My hbase key looks something like [random 10bytes][servicetype
>> > 12bytes][timestamp 10bytes] = 32 bytes total.  For an example key
>> > e23d4ac4b90002000100011398388474
>> >
>> > So the following code will find the above key:
>> >
>> > filter = FuzzyRowFilter([ Pair(array('b',
>> > "e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00"),
>> > array('b',
>> > [0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]))])
>> >
>> > But I'm only able to match at the beginning of the key, never the middle or
>> > at the end.
>> >
>> > This will not find the above key:
>> >
>> > filter = FuzzyRowFilter([ Pair(array('b',
>> > "e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x004"),
>> > array('b',
>> > [0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0]))])
>> >
>> > Am I doing something wrong?  Is there a better way to search for keys?
>> >  Really I'm going to want to search on the 12-byte service-type.
>> >
>> > Here is the full jython code:
>> >
>> > from array import array
>> > from org.apache.hadoop.hbase.util import Pair
>> > from org.apache.hadoop.hbase import HBaseConfiguration
>> > from org.apache.hadoop.hbase.client import HBaseAdmin, HTable, Scan
>> > from org.apache.hadoop.hbase.filter import FuzzyRowFilter
>> >
>> > conf = HBaseConfiguration()
>> > filter = FuzzyRowFilter([ Pair(array('b',
>> > "e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00"),
>> > array('b',
>> > [0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1])) ])
>> >
>> > scan = Scan()
>> > scan.setFilter(filter)
>> > table = HTable(conf,'mytable')
>> > s = table.getScanner(scan)
>> >
>> > while True:
>> >     r = s.next()
>> >     if not r:
>> >         break
>> >     else:
>> >         print r
>> >
>>
>
>
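
For the search on the 12-byte service-type mentioned in the original
question, a fuzzy key and mask could be built along these lines. This is only
a sketch: `build_fuzzy_pair` is a hypothetical helper, not part of HBase, and
it assumes the key layout described above (10 random bytes, 12 service-type
bytes, 10 timestamp bytes) and the mask convention used in the thread, where
0 marks a position that must match and 1 marks a position that may be
anything.

```python
def build_fuzzy_pair(key_len, offset, fixed):
    """Return (key, mask) for a fuzzy match on one fixed segment.

    key  -- string of key_len characters, NUL everywhere except the segment
    mask -- list of key_len ints: 0 = must match, 1 = any byte
    """
    if offset < 0 or offset + len(fixed) > key_len:
        raise ValueError("fixed segment does not fit in the key")
    tail = key_len - offset - len(fixed)
    # Single-backslash "\x00" gives one placeholder character per position.
    key = "\x00" * offset + fixed + "\x00" * tail
    mask = [1] * offset + [0] * len(fixed) + [1] * tail
    return key, mask


# Service-type taken from the example key e23d4ac4b90002000100011398388474.
key, mask = build_fuzzy_pair(32, 10, "000200010001")
print(len(key))
print(mask)
```

In Jython the two values would then be wrapped the same way as in the code
above, i.e. Pair(array('b', key), array('b', mask)), and passed to
FuzzyRowFilter.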
