drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-1957) NULL values in NOT IN sub-queries could lead to incorrect query result.
Date Fri, 24 Apr 2015 17:42:39 GMT

     [ https://issues.apache.org/jira/browse/DRILL-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aman Sinha updated DRILL-1957:
------------------------------
    Assignee: Jinfeng Ni  (was: Aman Sinha)

> NULL values in NOT IN sub-queries could lead to incorrect query result. 
> ------------------------------------------------------------------------
>
>                 Key: DRILL-1957
>                 URL: https://issues.apache.org/jira/browse/DRILL-1957
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>            Priority: Critical
>             Fix For: 0.9.0
>
>
> NULL values in NOT IN sub-queries will disqualify all the rows, and hence return 0 row
for a query. However, Drill could return some rows, due to issue in the query planning.
> For example, consider the following sample data :
> {code}
> select deptno from dfs.`/Users/jni/work/data/json/e.json` ;
> +------------+
> |   deptno   |
> +------------+
> | 10         |
> | 20         |
> | null       |
> +------------+
> 3 rows selected (0.062 seconds)
> 0: jdbc:drill:zk=local> select deptno from dfs.`/Users/jni/work/data/json/d.json`
;
> +------------+
> |   deptno   |
> +------------+
> | 10         |
> | null       |
> +------------+
> 2 rows selected (0.062 seconds)
> {code}
> The following query should return 0 row, but Drill actually return 1 row.
> {code}
> select deptno from dfs.`/Users/jni/work/data/json/e.json` where deptno not in (select
deptno from dfs.`/Users/jni/work/data/json/d.json`);
> +------------+
> |   deptno   |
> +------------+
> | 20         |
> +------------+
> 1 row selected (0.286 seconds)
> {code}
> In comparison, here is the result run on posture.
> {code}
> mydb=# select * from e;
>  deptno | ename
> --------+-------
>      10 | Alice
>    NULL | Tom
>      20 | Bob
> (3 rows)
> mydb=# delete from d where deptno = 20;
> DELETE 1
> mydb=# select * from d;
>  deptno | dname
> --------+-------
>      10 | Sales
>    NULL | Eng
> (2 rows)
> mydb=# select deptno from e where deptno not in (select deptno from d);
>  deptno
> --------
> (0 rows)
> {code}
> The fix seems to be in OPTIQ-373, which will produce a valid plan. However, such plan
contain a cross-join, which is not supported Drill, and hence hit CanNotPlanException in Drill.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message