hbase-user mailing list archives

From Liam Slusser <lslus...@gmail.com>
Subject FuzzyRowFilter weird results
Date Fri, 25 Apr 2014 01:24:08 GMT
Hey All -

I'm having some strange results using FuzzyRowFilter.  I'm programming in
jython for that extra bit of adventure.

My HBase row key looks something like [random 10 bytes][servicetype
12 bytes][timestamp 10 bytes] = 32 bytes total.  An example key:
e23d4ac4b90002000100011398388474

So the following code will find the above key:

filter = FuzzyRowFilter([ Pair(array('b',
"e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00"),
array('b',
[0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]))])

But I'm only able to match on the beginning of the key, never on the
middle or the end.

This will not find the above key:

filter = FuzzyRowFilter([ Pair(array('b',
"e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x004"),
array('b',
[0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0]))])
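
One thing that may be worth ruling out, in case the doubled backslashes above are literal in the script and not an artifact of how this message was archived: in Python, "\\x00" is four characters (backslash, x, 0, 0), while "\x00" is a single NUL byte. A minimal sketch in plain Python, with no HBase involved and with illustrative variable names, that compares the resulting lengths against the 32-entry mask:

```python
# Doubled backslash: each "\\x00" is the 4 literal characters \ x 0 0.
escaped = "e" + "\\x00" * 31

# Single backslash: each "\x00" is one NUL byte.
raw = "e" + "\x00" * 31

# Mask as in the first filter: byte 0 fixed, the remaining 31 wildcarded.
mask = [0] + [1] * 31

# Only `raw` has the same length as the mask.
lengths = (len(escaped), len(raw), len(mask))
```

If the lengths differ in the real script, the fuzzy key and the mask no longer line up byte for byte, which could be part of the problem.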

Am I doing something wrong?  Is there a better way to search for keys?
What I really want is to search on the 12-byte service type.
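
To make the intended service-type search concrete, here is a rough sketch in plain Python of the key/mask pair I would expect to need, following the convention used above (0 = byte must match, 1 = any byte). The helper names service_type_rule and matches are made up for illustration, and matches is only a simplified local model of the mask semantics for full-length keys, not HBase's actual implementation:

```python
RANDOM_LEN, SERVICE_LEN, TS_LEN = 10, 12, 10
KEY_LEN = RANDOM_LEN + SERVICE_LEN + TS_LEN  # 32 bytes total


def service_type_rule(service_type):
    """Return (fuzzy_key, mask) fixing only the 12-byte service-type field."""
    if len(service_type) != SERVICE_LEN:
        raise ValueError("service type must be %d bytes" % SERVICE_LEN)
    fuzzy_key = bytearray(KEY_LEN)  # zero-filled placeholder bytes
    mask = [1] * KEY_LEN            # 1 = wildcard everywhere by default
    for i in range(SERVICE_LEN):
        pos = RANDOM_LEN + i
        fuzzy_key[pos] = ord(service_type[i])
        mask[pos] = 0               # 0 = this byte must match exactly
    return fuzzy_key, mask


def matches(row, fuzzy_key, mask):
    """Local model only: compare fixed positions of a full-length ASCII row."""
    row = bytearray(row.encode("ascii"))
    if len(row) != len(fuzzy_key):
        return False  # simplification: only full-length keys are considered
    return all(m == 1 or r == k for r, k, m in zip(row, fuzzy_key, mask))


fuzzy_key, mask = service_type_rule("000200010001")
found = matches("e23d4ac4b90002000100011398388474", fuzzy_key, mask)
```

For the example key, bytes 10 through 21 are "000200010001", so this rule should select it regardless of the random prefix and the timestamp. In the Jython script the two sequences would presumably be wrapped in array('b', ...) and passed to Pair as in the filters above.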

Here is the full jython code:

from array import array
from org.apache.hadoop.hbase.util import Pair
from org.apache.hadoop.hbase import HBaseConfiguration
from org.apache.hadoop.hbase.client import HBaseAdmin, HTable, Scan
from org.apache.hadoop.hbase.filter import FuzzyRowFilter

conf = HBaseConfiguration()
filter = FuzzyRowFilter([ Pair(array('b',
"e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00"),
array('b',
[0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1])) ])

scan = Scan()
scan.setFilter(filter)
table = HTable(conf,'mytable')
s = table.getScanner(scan)

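try:
    while True:
        r = s.next()  # returns None once the scanner is exhausted
        if r is None:
            break
        print r
finally:
    # release the scanner and table even if the scan fails
    s.close()
    table.close()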
