drill-issues mailing list archives

From "Sean Hsuan-Yi Chu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-2967) Incompatible types error reported in a "not in" query with compatible data types
Date Tue, 26 May 2015 19:39:17 GMT

     [ https://issues.apache.org/jira/browse/DRILL-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Hsuan-Yi Chu updated DRILL-2967:
-------------------------------------
    Description: 
Two tables, stored as parquet files (attached to this bug):
{code}
0: jdbc:drill:schema=dfs> select * from t1;
+------------+------------+------------+
|     a1     |     b1     |     c1     |
+------------+------------+------------+
| 1          | aaaaa      | 2015-01-01 |
| 2          | bbbbb      | 2015-01-02 |
| 3          | ccccc      | 2015-01-03 |
| 4          | null       | 2015-01-04 |
| 5          | eeeee      | 2015-01-05 |
| 6          | fffff      | 2015-01-06 |
| 7          | ggggg      | 2015-01-07 |
| null       | hhhhh      | 2015-01-08 |
| 9          | iiiii      | null       |
| 10         | jjjjj      | 2015-01-10 |
+------------+------------+------------+
10 rows selected (0.119 seconds)

0: jdbc:drill:schema=dfs> select * from t2;
+------------+------------+------------+
|     a2     |     b2     |     c2     |
+------------+------------+------------+
| 0          | zzz        | 2014-12-31 |
| 1          | aaaaa      | 2015-01-01 |
| 2          | bbbbb      | 2015-01-02 |
| 2          | bbbbb      | 2015-01-02 |
| 2          | bbbbb      | 2015-01-02 |
| 3          | ccccc      | 2015-01-03 |
| 4          | ddddd      | 2015-01-04 |
| 5          | eeeee      | 2015-01-05 |
| 6          | fffff      | 2015-01-06 |
| 7          | ggggg      | 2015-01-07 |
| 7          | ggggg      | 2015-01-07 |
| 8          | hhhhh      | 2015-01-08 |
| 9          | iiiii      | 2015-01-09 |
+------------+------------+------------+
13 rows selected (0.116 seconds)
{code}

Disable hash join and set slice_target to 1:
{code}
alter session set `planner.enable_hashjoin` = false;
alter session set `planner.slice_target` = 1;
{code}
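To double-check that the overrides took effect, the values can be read back from sys.options (a sketch; assumes the sys.options columns of this Drill era):
{code}
-- verify the two session overrides (sketch)
select name, bool_val, num_val
from sys.options
where name in ('planner.enable_hashjoin', 'planner.slice_target');
{code}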

Correct result ('jjjjj' is the only b1 value missing from b2, and b2 contains no NULLs, so NOT IN is well defined):
{code}
0: jdbc:drill:schema=dfs> select * from t1 where b1 not in (select b2 from t2);
+------------+------------+------------+
|     a1     |     b1     |     c1     |
+------------+------------+------------+
| 10         | jjjjj      | 2015-01-10 |
+------------+------------+------------+
1 row selected (0.625 seconds)
{code}


Swap the two tables, however, and the query fails with an error:
{code}
0: jdbc:drill:schema=dfs> select * from t2 where b2 not in (select b1 from t1);
+------------+------------+------------+
|     a1     |     b1     |     c1     |
+------------+------------+------------+
Query failed: SYSTEM ERROR: Join only supports implicit casts between 1. Numeric data
 2. Varchar, Varbinary data Left type: INT, Right type: VARCHAR. Add explicit casts to avoid this error

Fragment 1:0

[1a83aa50-39aa-452c-91dd-970bf4a8f03d on atsqa4-133.qa.lab:31010]
java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
        at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
        at sqlline.SqlLine.print(SqlLine.java:1809)
        at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
        at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
        at sqlline.SqlLine.dispatch(SqlLine.java:889)
        at sqlline.SqlLine.begin(SqlLine.java:763)
        at sqlline.SqlLine.start(SqlLine.java:498)
        at sqlline.SqlLine.main(SqlLine.java:460)
{code}
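The message suggests adding explicit casts; a hypothetical workaround along those lines is sketched below. Note that b1 and b2 are both VARCHAR already, so these casts should be no-ops, which is why the failure looks like a planner bug rather than a user error:
{code}
-- hypothetical workaround per the error text; both columns are already varchar
select * from t2
where cast(b2 as varchar(10)) not in (select cast(b1 as varchar(10)) from t1);
{code}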

Explain plan for the query with an error:
{code}
0: jdbc:drill:schema=dfs> explain plan for select * from t2 where b2 not in (select b1 from t1);
+------------+------------+
|    text    |    json    |
+------------+------------+
| 00-00    Screen
00-01      Project(*=[$0])
00-02        UnionExchange
01-01          Project(T27¦¦*=[$0])
01-02            SelectionVectorRemover
01-03              Filter(condition=[NOT(CASE(=($2, 0), false, IS NOT NULL($6), true, IS NULL($4), null, <($3, $2), null, false))])
01-04                MergeJoin(condition=[=($4, $5)], joinType=[left])
01-06                  SelectionVectorRemover
01-08                    Sort(sort0=[$4], dir0=[ASC])
01-10                      Project(T27¦¦*=[$0], b2=[$1], $f0=[$2], $f1=[$3], b20=[$4])
01-12                        HashToRandomExchange(dist0=[[$4]])
02-01                          UnorderedMuxExchange
04-01                            Project(T27¦¦*=[$0], b2=[$1], $f0=[$2], $f1=[$3], b20=[$4], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($4))])
04-02                              Project(T27¦¦*=[$0], b2=[$1], $f0=[$2], $f1=[$3], b20=[$1])
04-03                                NestedLoopJoin(condition=[true], joinType=[inner])
04-05                                  Project(T27¦¦*=[$0], b2=[$1])
04-06                                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t2]], selectionRoot=/drill/testdata/aggregation/t2, numFiles=1, columns=[`*`]]])
04-04                                  BroadcastExchange
06-01                                    StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[$SUM0($1)])
06-02                                      UnionExchange
07-01                                        StreamAgg(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)])
07-02                                          Project(b1=[$0], $f1=[true])
07-03                                            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`b1`]]])
01-05                  Project(b1=[$0], $f10=[$1])
01-07                    SelectionVectorRemover
01-09                      Sort(sort0=[$0], dir0=[ASC])
01-11                        HashAgg(group=[{0}], agg#0=[MIN($1)])
01-13                          Project(b1=[$0], $f1=[$1])
01-14                            HashToRandomExchange(dist0=[[$0]])
03-01                              UnorderedMuxExchange
05-01                                Project(b1=[$0], $f1=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
05-02                                  HashAgg(group=[{0}], agg#0=[MIN($1)])
05-03                                    Project(b1=[$0], $f1=[true])
05-04                                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`b1`]]])
{code}
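For readers following the plan: the Filter/CASE on top of the left join is the standard NOT IN decorrelation, where $2/$3 are count(*)/count(b1) over t1, $4 is the outer key (b20) and $6 is the join match flag ($f10). Reconstructed as a sketch below (alias names are hypothetical, shown as generic SQL for readability; the planner produces this shape internally, so it may not run in Drill as written):
{code}
-- sketch of the rewrite implied by the plan above; aliases are hypothetical
select t2.*
from t2
cross join (select count(*) as cnt, count(b1) as cnt_nn from t1) agg  -- StreamAgg branch
left join (select b1, min(true) as matched from t1 group by b1) d     -- HashAgg branch
  on t2.b2 = d.b1                                                     -- the join that errors out
where not (case when agg.cnt = 0           then false  -- empty subquery: IN = false
                when d.matched is not null then true   -- key matched: IN = true
                when t2.b2 is null         then null   -- null outer key: IN = unknown
                when agg.cnt_nn < agg.cnt  then null   -- subquery has nulls: IN = unknown
                else false end);                       -- no match, no nulls: IN = false
{code}
Both operands of the failing MergeJoin condition (b20 and b1) are VARCHAR here, so the "Left type: INT, Right type: VARCHAR" complaint suggests the merge join is comparing the wrong column pair.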

Correct result and correct plan with hash join enabled under the same distributed setting (planner.slice_target = 1). The empty result is correct: t1.b1 contains a NULL, so `b2 not in (select b1 from t1)` can never evaluate to true.
{code}
alter session set `planner.enable_hashjoin` = true;
alter session set `planner.slice_target` = 1;
{code}

{code}
0: jdbc:drill:schema=dfs> select * from t2 where b2 not in (select b1 from t1);
+------------+------------+------------+
|     a2     |     b2     |     c2     |
+------------+------------+------------+
+------------+------------+------------+
No rows selected (0.458 seconds)

0: jdbc:drill:schema=dfs> explain plan for select * from t2 where b2 not in (select b1 from t1);
+------------+------------+
|    text    |    json    |
+------------+------------+
| 00-00    Screen
00-01      Project(*=[$0])
00-02        Project(T25¦¦*=[$0])
00-03          SelectionVectorRemover
00-04            Filter(condition=[NOT(CASE(=($2, 0), false, IS NOT NULL($6), true, IS NULL($4), null, <($3, $2), null, false))])
00-05              HashJoin(condition=[=($4, $5)], joinType=[left])
00-07                Project(T25¦¦*=[$0], b2=[$1], $f0=[$2], $f1=[$3], b20=[$1])
00-09                  NestedLoopJoin(condition=[true], joinType=[inner])
00-11                    Project(T25¦¦*=[$0], b2=[$1])
00-12                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t2]], selectionRoot=/drill/testdata/aggregation/t2, numFiles=1, columns=[`*`]]])
00-10                    BroadcastExchange
01-01                      StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[$SUM0($1)])
01-02                        UnionExchange
03-01                          StreamAgg(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)])
03-02                            Project(b1=[$0], $f1=[true])
03-03                              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`b1`]]])
00-06                Project(b1=[$0], $f10=[$1])
00-08                  BroadcastExchange
02-01                    HashAgg(group=[{0}], agg#0=[MIN($1)])
02-02                      Project(b1=[$0], $f1=[$1])
02-03                        HashToRandomExchange(dist0=[[$0]])
04-01                          UnorderedMuxExchange
05-01                            Project(b1=[$0], $f1=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
05-02                              HashAgg(group=[{0}], agg#0=[MIN($1)])
05-03                                Project(b1=[$0], $f1=[true])
05-04                                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`b1`]]])
{code}

The same error occurs with columns of date, time, and timestamp types.
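Presumably the same query shape over the date columns of the attached tables reproduces it, e.g.:
{code}
-- same NOT IN shape over the date columns (c1, c2)
select * from t2 where c2 not in (select c1 from t1);
{code}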


> Incompatible types error reported in a "not in" query with compatible data types 
> ---------------------------------------------------------------------------------
>
>                 Key: DRILL-2967
>                 URL: https://issues.apache.org/jira/browse/DRILL-2967
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.9.0
>            Reporter: Victoria Markman
>            Assignee: Sean Hsuan-Yi Chu
>             Fix For: 1.1.0
>
>         Attachments: t1.parquet, t2.parquet



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
