hbase-issues mailing list archives

From "Aditya Kishore (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9747) PrefixFilter with OR condition gives wrong results
Date Fri, 11 Oct 2013 23:53:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793144#comment-13793144 ]

Aditya Kishore commented on HBASE-9747:
---------------------------------------

Actually, I am surprised by the third scan result:

{quote}
hbase(main):002:0> scan 't1',
\{FILTER => "SingleColumnValueFilter('f1', 'q1', =, 'binary:113')"}

ROW COLUMN+CELL
c1 column=f1:q1, timestamp=1381469178679, value=113
1 row(s) in 0.0140 seconds
{quote}

This should have returned two rows:
{noformat}
a1 column=f1:q2, timestamp=1381468905492, value=111
c1 column=f1:q1, timestamp=1381468905549, value=113
{noformat}

By default, {{SingleColumnValueFilter}} does not filter out rows in which the specified
column does not exist ('a1', in your case), so that row is returned by the scan.
For the same scan I get this result:

{noformat}
hbase(main):010:0> scan 't1', {FILTER => "SingleColumnValueFilter('f1', 'q1', =, 'binary:113')"}
ROW                                                          COLUMN+CELL
 a1                                                          column=f1:q2, timestamp=1381528316466, value=111
 c1                                                          column=f1:q1, timestamp=1381528324693, value=113
2 row(s) in 0.0210 seconds
{noformat}
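
To make the default behaviour concrete, here is a toy model in plain Java. It does not use the HBase client API: the class {{ScvfDefaultModel}} and its methods are made up for illustration, and the table contents are copied from the repro in this issue. It only sketches the row-level decision described above, where a row that lacks the column is kept.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: plain Java collections, not the HBase client API.
class ScvfDefaultModel {
    // Row key -> ("family:qualifier" -> value), mirroring table 't1' from the repro.
    static Map<String, Map<String, String>> table() {
        Map<String, Map<String, String>> t = new LinkedHashMap<>();
        t.put("a1", Map.of("f1:q2", "111"));
        t.put("b1", Map.of("f1:q1", "112"));
        t.put("c1", Map.of("f1:q1", "113"));
        return t;
    }

    // Default decision as described in the comment above:
    // a row without the column passes; otherwise the value must match.
    static boolean keepRow(Map<String, String> row, String column, String expected) {
        String value = row.get(column);
        if (value == null) {
            return true; // column missing: row is not filtered out
        }
        return value.equals(expected);
    }

    // Returns the keys of the rows that survive the modelled filter.
    static List<String> scan() {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : table().entrySet()) {
            if (keepRow(e.getValue(), "f1:q1", "113")) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(scan()); // prints [a1, c1]
    }
}
```

In this model 'b1' is dropped because its value differs, while 'a1' survives only because it has no f1:q1 cell to compare.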

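If you want to drop the rows that do not include the column, you need to call {{SingleColumnValueFilter.setFilterIfMissing(true)}}.
From the shell you can do it this way, where the fifth argument ({{true}}) is {{filterIfMissing}} and the sixth ({{false}}) is {{latestVersionOnly}}: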
{noformat}
scan 't1', {FILTER => "SingleColumnValueFilter('f1', 'q1', =, 'binary:113', true, false)"}
{noformat}
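
The same reasoning can be sketched for the OR scan in the report. The snippet below is again a toy model built on {{java.util.function.Predicate}}, not the HBase filter classes: {{OrFilterModel}}, {{prefix}} and {{columnEquals}} are hypothetical names, and the model says nothing about how HBase evaluates a filter list internally. It only illustrates how a missing-column row passing the column filter would add a third row to the OR result.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy model only: predicates over plain maps, not the HBase filter classes.
class OrFilterModel {
    // Row key -> ("family:qualifier" -> value), mirroring table 't1' from the repro.
    static final Map<String, Map<String, String>> TABLE = new LinkedHashMap<>();
    static {
        TABLE.put("a1", Map.of("f1:q2", "111"));
        TABLE.put("b1", Map.of("f1:q1", "112"));
        TABLE.put("c1", Map.of("f1:q1", "113"));
    }

    // Keeps rows whose key starts with the given prefix.
    static Predicate<Map.Entry<String, Map<String, String>>> prefix(String p) {
        return row -> row.getKey().startsWith(p);
    }

    // Keeps rows whose column equals the expected value; a row without
    // the column is kept unless filterIfMissing is true.
    static Predicate<Map.Entry<String, Map<String, String>>> columnEquals(
            String column, String expected, boolean filterIfMissing) {
        return row -> {
            String v = row.getValue().get(column);
            return v == null ? !filterIfMissing : v.equals(expected);
        };
    }

    // Returns the keys of the rows accepted by the given predicate.
    static List<String> scan(Predicate<Map.Entry<String, Map<String, String>>> filter) {
        return TABLE.entrySet().stream()
                .filter(filter)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Default: 'a1' passes the column filter, so the OR keeps all three rows.
        System.out.println(scan(prefix("b").or(columnEquals("f1:q1", "113", false)))); // [a1, b1, c1]
        // With filterIfMissing modelled as true, 'a1' no longer passes either side.
        System.out.println(scan(prefix("b").or(columnEquals("f1:q1", "113", true))));  // [b1, c1]
    }
}
```

Under these assumptions, the three-row result follows from the default handling of rows that lack the column rather than from the prefix match.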


> PrefixFilter with OR condition gives wrong results
> --------------------------------------------------
>
>                 Key: HBASE-9747
>                 URL: https://issues.apache.org/jira/browse/HBASE-9747
>             Project: HBase
>          Issue Type: Bug
>          Components: Filters
>    Affects Versions: 0.94.9
>            Reporter: Deepa Remesh
>
> PrefixFilter when used with a SingleColumnValueFilter with an OR condition gives wrong
> results. In below example, each filter when evaluated separately gives 1 row each. The OR
> condition with the two filters gives 3 rows instead of 2. Repro below:
> create 't1', 'f1'
> put 't1','a1','f1:q2','111'
> put 't1','b1','f1:q1','112'
> put 't1','c1','f1:q1','113'
> hbase(main):020:0> scan 't1', {FILTER => "PrefixFilter ('b') OR SingleColumnValueFilter('f1', 'q1', =, 'binary:113')"}
> ROW                                                COLUMN+CELL
>  a1                                                column=f1:q2, timestamp=1381468905492, value=111
>  b1                                                column=f1:q1, timestamp=1381468905518, value=112
>  c1                                                column=f1:q1, timestamp=1381468905549, value=113
> 3 row(s) in 0.1020 seconds
> hbase(main):021:0> scan 't1', {FILTER => "PrefixFilter ('b')"}
> ROW                                                COLUMN+CELL
>  b1                                                column=f1:q1, timestamp=1381468905518, value=112
> 1 row(s) in 0.0150 seconds
> hbase(main):002:0> scan 't1', {FILTER => "SingleColumnValueFilter('f1', 'q1', =, 'binary:113')"}
> ROW                                                COLUMN+CELL
>  c1                                                column=f1:q1, timestamp=1381469178679, value=113
> 1 row(s) in 0.0140 seconds



--
This message was sent by Atlassian JIRA
(v6.1#6144)
