hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3562) ValueFilter is being evaluated before performing the column match
Date Thu, 17 Nov 2016 02:21:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15672429#comment-15672429 ]

Duo Zhang commented on HBASE-3562:
----------------------------------

Do you mean we should commit the UTs in this patch?

On master we now call columns.checkColumn before evaluating the filter, so I think the problem
described here is gone. But in general, I think we should also count versions before evaluating
filters. The current implementation (filter first, then count versions) may return different
results on the same data set before and after a major compaction.

Consider this: you set maxVersions to 3 and a column has 4 versions. Your filter filters
out the 3 newer versions, so a get or scan returns the oldest version. Then a major
compaction runs and the oldest version is reclaimed. From that point on, the same get or
scan returns nothing.
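
The scenario above can be sketched as a small standalone simulation. This is only an
illustration, not HBase code: the class VersionFilterOrder and its methods are hypothetical
names, each cell is modelled as a plain integer (newest first), and it assumes the same
limit of 3 applies both to the read and to what a major compaction retains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of one column's versions; not part of HBase.
final class VersionFilterOrder {

    // Current behaviour as described above: apply the filter first,
    // then count only the cells that passed it against maxVersions.
    static List<Integer> filterThenCount(List<Integer> newestFirst, int maxVersions,
                                         Predicate<Integer> filter) {
        List<Integer> out = new ArrayList<>();
        for (Integer cell : newestFirst) {
            if (out.size() == maxVersions) {
                break;
            }
            if (filter.test(cell)) {
                out.add(cell);
            }
        }
        return out;
    }

    // Proposed behaviour: count versions first, so only the newest
    // maxVersions cells are ever shown to the filter.
    static List<Integer> countThenFilter(List<Integer> newestFirst, int maxVersions,
                                         Predicate<Integer> filter) {
        List<Integer> out = new ArrayList<>();
        int limit = Math.min(maxVersions, newestFirst.size());
        for (int i = 0; i < limit; i++) {
            Integer cell = newestFirst.get(i);
            if (filter.test(cell)) {
                out.add(cell);
            }
        }
        return out;
    }

    // Assumption: a major compaction keeps only the newest maxVersions cells.
    static List<Integer> majorCompact(List<Integer> newestFirst, int maxVersions) {
        int limit = Math.min(maxVersions, newestFirst.size());
        return new ArrayList<>(newestFirst.subList(0, limit));
    }

    public static void main(String[] args) {
        List<Integer> cells = Arrays.asList(4, 3, 2, 1);   // 4 versions, newest first
        Predicate<Integer> onlyOldest = v -> v == 1;       // rejects the 3 newer versions
        List<Integer> compacted = majorCompact(cells, 3);

        System.out.println(filterThenCount(cells, 3, onlyOldest));     // [1]
        System.out.println(filterThenCount(compacted, 3, onlyOldest)); // []
        System.out.println(countThenFilter(cells, 3, onlyOldest));     // []
        System.out.println(countThenFilter(compacted, 3, onlyOldest)); // []
    }
}
```

In this model, filter-then-count gives a different answer before and after compaction,
while count-then-filter gives the same (empty) answer both times.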

I think we need to fix this, even though it is an 'incompatible change'.

Thanks.

> ValueFilter is being evaluated before performing the column match
> -----------------------------------------------------------------
>
>                 Key: HBASE-3562
>                 URL: https://issues.apache.org/jira/browse/HBASE-3562
>             Project: HBase
>          Issue Type: Bug
>          Components: Filters
>    Affects Versions: 0.90.0, 0.94.7
>            Reporter: Evert Arckens
>         Attachments: HBASE-3562.patch
>
>
> When performing a Get operation where both a column and a ValueFilter are specified,
> the ValueFilter is evaluated before the column match is made, contrary to what the javadoc
> of Get.setFilter() indicates: "{@link Filter#filterKeyValue(KeyValue)} is called AFTER all tests
> for ttl, column match, deletes and max versions have been run."
> This is shown in the little test below, which uses a TestComparator extending WritableByteArrayComparable.
> public void testFilter() throws Exception {
> 	byte[] cf = Bytes.toBytes("cf");
> 	byte[] row = Bytes.toBytes("row");
> 	byte[] col1 = Bytes.toBytes("col1");
> 	byte[] col2 = Bytes.toBytes("col2");
> 	Put put = new Put(row);
> 	put.add(cf, col1, new byte[]{(byte)1});
> 	put.add(cf, col2, new byte[]{(byte)2});
> 	table.put(put);
> 	Get get = new Get(row);
> 	get.addColumn(cf, col2); // We only want to retrieve col2
> 	TestComparator testComparator = new TestComparator();
> 	Filter filter = new ValueFilter(CompareOp.EQUAL, testComparator);
> 	get.setFilter(filter);
> 	Result result = table.get(get);
> }
> public class TestComparator extends WritableByteArrayComparable {
>     /**
>      * Nullary constructor, for Writable
>      */
>     public TestComparator() {
>         super();
>     }
>     
>     @Override
>     public int compareTo(byte[] theirValue) {
>         if (theirValue[0] == (byte)1) {
>             // If the column match was done before evaluating the filter, we should never get here.
>             throw new RuntimeException("I only expect (byte)2 in col2, not (byte)1 from col1");
>         }
>         if (theirValue[0] == (byte)2) {
>             return 0;
>         }
>         else return 1;
>     }
> }
> When only one column should be retrieved, this can be worked around by using a SingleColumnValueFilter
> instead of the ValueFilter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
