kylin-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-3634) When filter column has null value may cause incorrect query result
Date Fri, 19 Oct 2018 10:13:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656571#comment-16656571 ]

ASF GitHub Bot commented on KYLIN-3634:
---------------------------------------

shaofengshi closed pull request #292: KYLIN-3634 when filter column has null value may cause incorrect query result
URL: https://github.com/apache/kylin/pull/292
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/core-cube/src/main/java/org/apache/kylin/gridtable/GTUtil.java b/core-cube/src/main/java/org/apache/kylin/gridtable/GTUtil.java
index f03329b2f8..298225f166 100644
--- a/core-cube/src/main/java/org/apache/kylin/gridtable/GTUtil.java
+++ b/core-cube/src/main/java/org/apache/kylin/gridtable/GTUtil.java
@@ -297,7 +297,7 @@ protected TupleFilter encodeConstants(CompareTupleFilter oldCompareFilter) {
             case NEQ:
                 code = translate(col, firstValue, 0);
                 if (code == null) {
-                    result = ConstantTupleFilter.TRUE;
+                    result = newCompareFilter(TupleFilter.FilterOperatorEnum.ISNOTNULL, externalCol);
                 } else {
                     newCompareFilter.addChild(new ConstantTupleFilter(code));
                     result = newCompareFilter;
@@ -308,7 +308,7 @@ protected TupleFilter encodeConstants(CompareTupleFilter oldCompareFilter) {
                 if (code == null) {
                     code = translate(col, firstValue, -1);
                     if (code == null)
-                        result = ConstantTupleFilter.FALSE;
+                        result = newCompareFilter(TupleFilter.FilterOperatorEnum.ISNOTNULL, externalCol);
                     else
                         result = newCompareFilter(FilterOperatorEnum.LTE, externalCol, code);
                 } else {
@@ -330,7 +330,7 @@ protected TupleFilter encodeConstants(CompareTupleFilter oldCompareFilter) {
                 if (code == null) {
                     code = translate(col, firstValue, 1);
                     if (code == null)
-                        result = ConstantTupleFilter.FALSE;
+                        result = newCompareFilter(TupleFilter.FilterOperatorEnum.ISNOTNULL, externalCol);
                     else
                         result = newCompareFilter(FilterOperatorEnum.GTE, externalCol, code);
                 } else {
@@ -360,6 +360,12 @@ private TupleFilter newCompareFilter(FilterOperatorEnum op, TblColRef col, ByteA
             return r;
         }
 
+        private TupleFilter newCompareFilter(TupleFilter.FilterOperatorEnum op, TblColRef col) {
+            CompareTupleFilter r = new CompareTupleFilter(op);
+            r.addChild(new ColumnTupleFilter(col));
+            return r;
+        }
+
         transient ByteBuffer buf;
 
         protected ByteArray translate(int col, Object value, int roundingFlag) {
diff --git a/core-storage/src/test/java/org/apache/kylin/storage/gtrecord/DictGridTableTest.java b/core-storage/src/test/java/org/apache/kylin/storage/gtrecord/DictGridTableTest.java
index b8de556520..d80df7870b 100644
--- a/core-storage/src/test/java/org/apache/kylin/storage/gtrecord/DictGridTableTest.java
+++ b/core-storage/src/test/java/org/apache/kylin/storage/gtrecord/DictGridTableTest.java
@@ -415,7 +415,9 @@ public void verifyConvertFilterConstants2() {
         {
             LogicalTupleFilter filter = and(fComp1, compare(extColB, FilterOperatorEnum.LT, "9"));
             TupleFilter newFilter = GTUtil.convertFilterColumnsAndConstants(filter, info, colMapping, null);
-            assertEquals(ConstantTupleFilter.FALSE, newFilter);
+            assertEquals(
+                    "AND [UNKNOWN_MODEL:NULL.GT_MOCKUP_TABLE.0 GT [\\x00\\x00\\x01J\\xE5\\xBD\\x5C\\x00], UNKNOWN_MODEL:NULL.GT_MOCKUP_TABLE.1 ISNOTNULL []]",
+                    newFilter.toString());
         }
 
         // $1<"10" needs no rounding
@@ -480,7 +482,8 @@ public void verifyConvertFilterConstants3() {
         {
             LogicalTupleFilter filter = and(fComp1, compare(extColB, FilterOperatorEnum.GT, "101"));
             TupleFilter newFilter = GTUtil.convertFilterColumnsAndConstants(filter, info, colMapping, null);
-            assertEquals(ConstantTupleFilter.FALSE, newFilter);
+            assertEquals("AND [UNKNOWN_MODEL:NULL.GT_MOCKUP_TABLE.0 GT [\\x00\\x00\\x01J\\xE5\\xBD\\x5C\\x00], UNKNOWN_MODEL:NULL.GT_MOCKUP_TABLE.1 ISNOTNULL []]",
+                    newFilter.toString());
         }
 
         // $1>"100" needs no rounding


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> When filter column has null value may cause incorrect query result
> ------------------------------------------------------------------
>
>                 Key: KYLIN-3634
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3634
>             Project: Kylin
>          Issue Type: Bug
>          Components: Query Engine
>    Affects Versions: v2.0.0
>            Reporter: WangBo
>            Assignee: WangBo
>            Priority: Major
>             Fix For: v2.4.2, v2.5.1
>
>         Attachments: 0001-KYLIN-3634-when-filter-column-has-null-value-may-cau.patch
>
>
> h1. Question
> When a column contains null values and is used as a filter column in a query, and the filter value does not exist in the table, the query may return an incorrect result.
> h1. An Example
> h2. Table A
> Table A has three rows; the city column of one row is null.
>  
> ||day||...||city||price||
> |20180101| |null|10|
> |20180101| |beijing|20|
> |20180101| |shanghai|10|
> h2. Query SQL
> select day,sum(price) from a where city <> 'abc' group by day
> h2. Correct Result
> The row whose city value is null is excluded:
> ||day||col||
> |20180101|30|
> h2. Incorrect Result
> The query returns 0 rows.
> This happened in our production environment, which runs Kylin 2.0.0.
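
The expected result follows from SQL's treatment of NULL in comparisons: `city <> 'abc'` is not true for a row whose city is NULL, so that row is dropped and the remaining prices sum to 30. A minimal sketch of that semantics in plain Java, using a hypothetical in-memory copy of Table A (the class and method names are illustrative assumptions and are not Kylin code):

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of the example table; not Kylin code.
class NullFilterSemantics {
    // One row of Table A; city may be null.
    static final class Row {
        final String day;
        final String city;
        final int price;

        Row(String day, String city, int price) {
            this.day = day;
            this.city = city;
            this.price = price;
        }
    }

    static final List<Row> TABLE_A = Arrays.asList(
            new Row("20180101", null, 10),
            new Row("20180101", "beijing", 20),
            new Row("20180101", "shanghai", 10));

    // SQL semantics of `city <> value`: a NULL city compares as UNKNOWN,
    // which a WHERE clause treats as "row not selected".
    static boolean cityNotEqual(Row row, String value) {
        return row.city != null && !row.city.equals(value);
    }

    // select sum(price) from a where city <> value
    // (the example has a single day, so there is only one group)
    static int sumWhereCityNotEqual(String value) {
        int sum = 0;
        for (Row row : TABLE_A) {
            if (cityNotEqual(row, value)) {
                sum += row.price;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWhereCityNotEqual("abc")); // prints 30
    }
}
```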
> h1. Analysis process
> 1. The city column does not contain the filter value, so the CompareTupleFilter is turned into a ConstantTupleFilter (see GTUtil.java).
> 2. If the dimensions in the SQL do not match all the columns used in GROUP BY, the bytesComparator used in the HBase aggregation map compares only the columns used in GROUP BY.
> 3. When GTAggregateScanner constructs the key of aggBufMap, the key may contain a null value. Because the comparator of aggBufMap compares only the GROUP BY columns, tuples that share the same GROUP BY columns may also share the same key, and that key may contain the null value. As a result, the Kylin server may receive tuples that contain a null value.
> 4. When the code dynamically generated by Calcite applies the filter to the tuples, it first checks whether the column is null. Because the filter column in the tuple contains a null value, the check always returns false and no tuples are returned.
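
Steps 2 to 4 can be illustrated with a small stand-in, assuming a sorted map whose comparator looks only at the group-by column. This is a simplified model and not the GTAggregateScanner implementation: `Tuple`, `exampleScan`, `aggregateByDay`, and `rowsAfterQueryFilter` are hypothetical names, and a `TreeMap` is used only because it keeps the first-inserted key when a later key compares as equal.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Simplified stand-in for the behavior described in steps 2-4; not Kylin code.
class AggregationKeyDemo {
    static final class Tuple {
        final String day;
        final String city;
        final int price;

        Tuple(String day, String city, int price) {
            this.day = day;
            this.city = city;
            this.price = price;
        }
    }

    // Rows that reach aggregation when the storage-side filter is a constant TRUE.
    static List<Tuple> exampleScan() {
        return Arrays.asList(
                new Tuple("20180101", null, 10),
                new Tuple("20180101", "beijing", 20),
                new Tuple("20180101", "shanghai", 10));
    }

    // The comparator looks only at the group-by column (day), so all three
    // tuples collapse into one entry whose key is the first tuple inserted.
    static TreeMap<Tuple, Integer> aggregateByDay(List<Tuple> scanned) {
        TreeMap<Tuple, Integer> aggBufMap =
                new TreeMap<>(Comparator.comparing((Tuple t) -> t.day));
        for (Tuple t : scanned) {
            aggBufMap.merge(t, t.price, Integer::sum);
        }
        return aggBufMap;
    }

    // Query-side re-check of `city <> value`, with the null test applied first.
    static long rowsAfterQueryFilter(TreeMap<Tuple, Integer> aggregated, String value) {
        return aggregated.keySet().stream()
                .filter(k -> k.city != null && !k.city.equals(value))
                .count();
    }

    public static void main(String[] args) {
        TreeMap<Tuple, Integer> aggregated = aggregateByDay(exampleScan());
        System.out.println(aggregated.firstKey().city);              // prints null
        System.out.println(rowsAfterQueryFilter(aggregated, "abc")); // prints 0
    }
}
```

In this model the surviving key carries the null city of the first tuple, so the query-side check rejects the whole group and the query returns no rows.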
> h1. Solution
> When the filter value is invalid, meaning it is not in the table, turn the CompareTupleFilter into an IS_NOT_NULL filter instead of ConstantTupleFilter.TRUE.
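
The effect of this rewrite on the `<>` case can be sketched as follows. This is a hedged model only: the dictionary is approximated by a `Set<String>`, and `Rewritten`, `rewriteNeqBefore`, `rewriteNeqAfter`, and `accepts` are hypothetical names that do not exist in GTUtil.java.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified model of the NEQ rewrite; not the GTUtil implementation.
class NeqRewriteDemo {
    enum Rewritten { CONSTANT_TRUE, IS_NOT_NULL, NEQ_ENCODED }

    // Old behavior: a constant missing from the dictionary makes `<>` a constant TRUE.
    static Rewritten rewriteNeqBefore(Set<String> dictionary, String constant) {
        return dictionary.contains(constant) ? Rewritten.NEQ_ENCODED : Rewritten.CONSTANT_TRUE;
    }

    // Proposed behavior: the same case becomes an IS NOT NULL check on the column.
    static Rewritten rewriteNeqAfter(Set<String> dictionary, String constant) {
        return dictionary.contains(constant) ? Rewritten.NEQ_ENCODED : Rewritten.IS_NOT_NULL;
    }

    // Whether a row with the given city passes the rewritten storage-side filter.
    static boolean accepts(Rewritten filter, String city, String constant) {
        switch (filter) {
            case CONSTANT_TRUE:
                return true;
            case IS_NOT_NULL:
                return city != null;
            default:
                return city != null && !city.equals(constant);
        }
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("beijing", "shanghai"));
        Rewritten before = rewriteNeqBefore(dictionary, "abc");
        Rewritten after = rewriteNeqAfter(dictionary, "abc");
        System.out.println(accepts(before, null, "abc")); // prints true: the null row reaches aggregation
        System.out.println(accepts(after, null, "abc"));  // prints false: the null row is filtered out
    }
}
```

Under these assumptions, non-null rows pass both variants, so the rewrite changes only whether the null row reaches aggregation.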
>  
> I have tested the fix in our production environment.
> The "mvn test" run passed, but the fix has not been tested in the sandbox.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
