drill-issues mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (DRILL-4825) Wrong data with UNION ALL when querying different sub-directories under the same table
Date Thu, 04 Aug 2016 00:52:20 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406902#comment-15406902 ]

Aman Sinha edited comment on DRILL-4825 at 8/4/16 12:51 AM:
------------------------------------------------------------

I found that the incorrect results are caused by the fix for DRILL-2517. Commit 03197d0,
which precedes that patch, produces correct results:
{noformat}
// correct result
0: jdbc:drill:zk=local> select * from (select min(o_custkey) as x from dfs.`multilevel/parquet`
where dir0 = 1994) inner join (select min(o_custkey) as y from dfs.`multilevel/parquet` where
dir0 = 1996) on x = y;
+----+----+
| x  | y  |
+----+----+
+----+----+
No rows selected (0.726 seconds)
{noformat}

Commit 9b4008d, which corresponds to DRILL-2517, produces incorrect results:
{noformat}
// incorrect result
0: jdbc:drill:zk=local> select * from (select min(o_custkey) as x from dfs.`multilevel/parquet`
where dir0 = 1994) inner join (select min(o_custkey) as y from dfs.`/Users/asinha/data/multilevel/parquet`
where dir0 = 1996) on x = y;
+-----+-----+
|  x  |  y  |
+-----+-----+
| 25  | 25  |
+-----+-----+
1 row selected (0.699 seconds)
{noformat}



was (Author: amansinha100):
I found that the incorrect results are caused by the fix for DRILL-2517. Commit 03197d0,
which precedes that patch, produces correct results:
{noformat}
// correct result
0: jdbc:drill:zk=local> select * from (select min(o_custkey) as x from dfs.`multilevel/parquet`
where dir0 = 1994) inner join (select min(o_custkey) as y from dfs.`multilevel/parquet` where
dir0 = 1996) on x = y;
+----+----+
| x  | y  |
+----+----+
+----+----+
No rows selected (0.726 seconds)
{noformat}

> Wrong data with UNION ALL when querying different sub-directories under the same table
> --------------------------------------------------------------------------------------
>
>                 Key: DRILL-4825
>                 URL: https://issues.apache.org/jira/browse/DRILL-4825
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.8.0
>            Reporter: Rahul Challapalli
>            Priority: Critical
>         Attachments: l_3level.tgz
>
>
> git.commit.id.abbrev=0700c6b
> The query below returns wrong results:
> {code}
> select count (*) from (
>   select l_orderkey, dir0 from l_3level t1 where t1.dir0 = 1 and t1.dir1='one' and t1.dir2 = '2015-7-12'
>   union all
>   select l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='two' and t2.dir2 = '2015-8-12') data;
> +---------+
> | EXPR$0  |
> +---------+
> | 20      |
> +---------+
> {code}
> The wrong result is evident from the output of the queries below:
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> select count (*) from (select l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='two' and t2.dir2 = '2015-8-12');
> +---------+
> | EXPR$0  |
> +---------+
> | 30      |
> +---------+
> 1 row selected (0.258 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select count (*) from (select l_orderkey, dir0 from l_3level t2 where t2.dir0 = 1 and t2.dir1='one' and t2.dir2 = '2015-7-12');
> +---------+
> | EXPR$0  |
> +---------+
> | 10      |
> +---------+
> {code}
> I have attached the data set. Let me know if you need anything more.
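
The check implied by the report is that a count over UNION ALL must equal the sum of the counts of its two branches (here 30 + 10 = 40, not 20). A minimal sketch of that invariant follows. It does not use Drill: it assumes Python's built-in sqlite3 module and a hypothetical in-memory table l_3level in which dir0, dir1 and dir2 are ordinary columns standing in for Drill's directory columns, and the sample rows are made up to match the reported branch counts.

```python
import sqlite3

# The two branches of the reported query, with the table aliases dropped.
BRANCH_ONE = ("select l_orderkey, dir0 from l_3level "
              "where dir0 = 1 and dir1 = 'one' and dir2 = '2015-7-12'")
BRANCH_TWO = ("select l_orderkey, dir0 from l_3level "
              "where dir0 = 1 and dir1 = 'two' and dir2 = '2015-8-12'")


def build_sample():
    """Create a hypothetical stand-in for the l_3level data set (not the attachment)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table l_3level "
        "(l_orderkey integer, dir0 integer, dir1 text, dir2 text)"
    )
    rows = [(i, 1, "one", "2015-7-12") for i in range(10)]
    rows += [(i, 1, "two", "2015-8-12") for i in range(30)]
    conn.executemany("insert into l_3level values (?, ?, ?, ?)", rows)
    return conn


def count_rows(conn, query):
    """Count the rows produced by an arbitrary select statement."""
    return conn.execute(f"select count(*) from ({query})").fetchone()[0]


conn = build_sample()
n_one = count_rows(conn, BRANCH_ONE)
n_two = count_rows(conn, BRANCH_TWO)
n_union = count_rows(conn, f"{BRANCH_ONE} union all {BRANCH_TWO}")

# UNION ALL keeps every row from both branches, so the counts must add up.
assert n_union == n_one + n_two
```

The same comparison could be run against Drill itself by issuing the three count queries through any client and comparing the results; the sqlite3 version only illustrates the expected relationship.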



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
