hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10850) essential column family optimization is broken
Date Thu, 03 Apr 2014 10:31:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958698#comment-13958698
] 

Hadoop QA commented on HBASE-10850:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12638443/10850-v7.txt
  against trunk revision .
  ATTACHMENT ID: 12638443

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new or modified
tests.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version
1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

     {color:red}-1 core tests{color}.  The patch failed these unit tests:
     

     {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 	at org.apache.hadoop.hbase.mapreduce.TestTableMapReduceBase.testMultiRegionTable(TestTableMapReduceBase.java:96)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9188//console

This message is automatically generated.

> essential column family optimization is broken
> ----------------------------------------------
>
>                 Key: HBASE-10850
>                 URL: https://issues.apache.org/jira/browse/HBASE-10850
>             Project: HBase
>          Issue Type: Bug
>          Components: Coprocessors, Filters, Performance
>    Affects Versions: 0.96.1.1
>            Reporter: Fabien Le Gallo
>            Assignee: Ted Yu
>            Priority: Blocker
>             Fix For: 0.98.1, 0.99.0, 0.96.3
>
>         Attachments: 10850-hasFilterRow-v1.txt, 10850-hasFilterRow-v2.txt, 10850-hasFilterRow-v3.txt,
10850-v4.txt, 10850-v5.txt, 10850-v6.txt, 10850-v7.txt, HBASE-10850-96.patch, HBASE-10850.patch,
HBASE-10850_V2.patch, HBaseSingleColumnValueFilterTest.java, TestWithMiniCluster.java
>
>
> When using the filter SingleColumnValueFilter, and depending of the columns specified
in the scan (filtering column always specified), the results can be different.
> Here is an example.
> Suppose the following table:
> ||key||a:foo||a:bar||b:foo||b:bar||
> |1|false|_flag_|_flag_|_flag_|
> |2|true|_flag_|_flag_|_flag_|
> |3| |_flag_|_flag_|_flag_|
> With this filter:
> {code}
> SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes("a"), Bytes.toBytes("foo"),
CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("false")));
> filter.setFilterIfMissing(true);
> {code}
> Depending of how I specify the list of columns to add in the scan, the result is different.
Yet, all examples below should always return only the first row (key '1'):
> OK:
> {code}
> scan.addFamily(Bytes.toBytes("a"));
> {code}
> KO (2 results returned, row '3' without 'a:foo' qualifier is returned):
> {code}
> scan.addFamily(Bytes.toBytes("a"));
> scan.addFamily(Bytes.toBytes("b"));
> {code}
> KO (2 results returned, row '3' without 'a:foo' qualifier is returned):
> {code}
> scan.addColumn(Bytes.toBytes("a"), Bytes.toBytes("foo"));
> scan.addColumn(Bytes.toBytes("a"), Bytes.toBytes("bar"));
> scan.addColumn(Bytes.toBytes("b"), Bytes.toBytes("foo"));
> {code}
> OK:
> {code}
> scan.addColumn(Bytes.toBytes("a"), Bytes.toBytes("foo"));
> scan.addColumn(Bytes.toBytes("b"), Bytes.toBytes("bar"));
> {code}
> OK:
> {code}
> scan.addColumn(Bytes.toBytes("a"), Bytes.toBytes("foo"));
> scan.addColumn(Bytes.toBytes("a"), Bytes.toBytes("bar"));
> {code}
> This is a regression as it was working properly on HBase 0.92.
> You will find in attachement the unit tests reproducing the issue.
> +The analysis of this issue lead us to 2 critical bugs induced in 96 and above versions+
> 1. The essential family optimization is broken in some cases.  In case of condition on
some families, we 1st will read those KVs and apply condition on those, when the condition
says to filter out that row, we will not go ahead and fetch data from remaining non essential
CFs. But now in most of the cases we will do this unwanted data read which is fully against
this optimization
> 2. We have a CP hook postFilterRow() which will be called when a row is getting filtered
out by the Filter.  This gives the CP to do a reseek to the next known row which it thinks
can evaluate the condition to true. But currently in 96+ code , this hook is not getting called.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message