hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: question about merge-join (or AND operator between columns)
Date Sat, 08 Jan 2011 19:36:37 GMT
Sounds like you need to write a little filter, Jack, one that filters
out every row that does not have values in all of the query columns.
Maybe you can manhandle SkipFilter into doing the job?
http://hbase.apache.org/docs/r0.89.20100924/apidocs/org/apache/hadoop/hbase/filter/SkipFilter.html

St.Ack
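
To make the difference between the OR behaviour of the current scan and the requested AND behaviour concrete, here is a minimal sketch in plain Java. It does not use the HBase client or filter API at all: the table is modelled as an in-memory map from row key to the set of column families that hold a cell, and the names AndColumnsSketch, sampleRows, rowsWithAllFamilies and rowsWithAnyFamily are hypothetical, chosen only for illustration. A real server-side filter would have to apply the same per-row check inside HBase.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: an in-memory model of the 'mattest' rows from this thread,
// not HBase client code.
class AndColumnsSketch {

    // Row key -> column families that have at least one cell in that row.
    static Map<String, Set<String>> sampleRows() {
        Map<String, Set<String>> rows = new LinkedHashMap<>();
        rows.put("1", Set.of("generic", "photo", "type"));
        rows.put("2", Set.of("generic", "type", "video"));
        return rows;
    }

    // AND semantics: keep a row only if it has a cell in every required family.
    static List<String> rowsWithAllFamilies(Map<String, Set<String>> rows,
                                            List<String> required) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : rows.entrySet()) {
            if (row.getValue().containsAll(required)) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }

    // OR semantics: keep a row if it has a cell in at least one requested family.
    static List<String> rowsWithAnyFamily(Map<String, Set<String>> rows,
                                          List<String> requested) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : rows.entrySet()) {
            for (String family : requested) {
                if (row.getValue().contains(family)) {
                    matches.add(row.getKey());
                    break; // one matching family is enough for OR
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> query = List.of("generic", "photo");
        System.out.println("OR:  " + rowsWithAnyFamily(sampleRows(), query));   // OR:  [1, 2]
        System.out.println("AND: " + rowsWithAllFamilies(sampleRows(), query)); // AND: [1]
    }
}
```

With the two sample rows, the OR variant returns both row keys, while the AND variant returns only row 1, which is the result asked for in the original question.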

On Sat, Jan 8, 2011 at 10:30 AM, Jack Levin <magnito@gmail.com> wrote:
> Sorry, my mistake, right now it's only OR, and we really need AND.
> I would think that with bloomfilters this could be a sweet feature to
> produce if it's not there.
>
>
> -Jack
>
> On Fri, Jan 7, 2011 at 10:50 PM, Phil Whelan <phil123@gmail.com> wrote:
>> Hi Jack,
>>
>> I'm just trying to follow the logic and I'm a bit confused.
>>
>>> Note that  ['generic', 'photo'], utilizes 'OR' operator, and not
>>> 'AND'.   Is it possible to create a scanner that will not AND and not
>>> OR?, in which case something like this:
>>
>> Am I right in thinking you meant "AND and not OR" instead of "not AND
>> and not OR"?
>>
>> Thanks,
>> Phil
>>
>> On Fri, Jan 7, 2011 at 8:01 PM, Jack Levin <magnito@gmail.com> wrote:
>>> Hello all, I have a scanner question, we have this table:
>>>
>>> hbase(main):002:0> scan 'mattest'
>>> ROW                                          COLUMN+CELL
>>>  1                                           column=generic:, timestamp=1294454057618, value=1
>>>  1                                           column=photo:, timestamp=1294453830339, value=1
>>>  1                                           column=type:, timestamp=1294453812716, value=photo
>>>  1                                           column=type:photo, timestamp=1294453884174, value=photo
>>>  2                                           column=generic:, timestamp=1294454061156, value=1
>>>  2                                           column=type:, timestamp=1294453851757, value=video
>>>  2                                           column=type:video, timestamp=1294453877719, value=video
>>>  2                                           column=video:, timestamp=1294453842722, value=1
>>>
>>> We need to run this query:
>>>
>>> hbase(main):004:0> scan 'mattest', {COLUMNS => ['generic', 'photo']}
>>> ROW                                          COLUMN+CELL
>>>  1                                           column=generic:, timestamp=1294454057618, value=1
>>>  1                                           column=photo:, timestamp=1294453830339, value=1
>>>  2                                           column=generic:, timestamp=1294454061156, value=1
>>>
>>> Note that  ['generic', 'photo'], utilizes 'OR' operator, and not
>>> 'AND'.   Is it possible to create a scanner that will not AND and not
>>> OR?, in which case something like this:
>>>
>>> scan 'mattest', {COLUMNS => ['generic' AND 'photo']}
>>> ROW                                          COLUMN+CELL
>>>  1                                           column=generic:, timestamp=1294454057618, value=1
>>>  1                                           column=photo:, timestamp=1294453830339, value=1
>>>
>>> Thanks in advance.
>>>
>>> -Jack
>>>
>>
>
