drill-issues mailing list archives

From "Victoria Markman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-2233) IOB exception when scalar subquery is used in the IN clause
Date Thu, 12 Feb 2015 22:24:11 GMT

     [ https://issues.apache.org/jira/browse/DRILL-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Victoria Markman updated DRILL-2233:
------------------------------------
    Description: 
Failing query:
{code}
SELECT a1, 
       COUNT(*) 
FROM   t1 
WHERE  ( a1 ) IN (SELECT MAX(a2) 
                               FROM   t2) 
GROUP  BY a1 
ORDER  BY a1; 
{code}

{code}
0: jdbc:drill:schema=dfs> select a1, count(*) from t1 where (a1) in (select  min(a2) from t2) group by a1 order by a1;
+------------+------------+
|     a1     |   EXPR$1   |
+------------+------------+
Query failed: RemoteRpcException: Failure while running fragment., index: 0, length: 1 (expected: range(0, 0)) [ 00b3ed27-67be-4343-849b-b9b783cabe07 on atsqa4-133.qa.lab:31010 ]
[ 00b3ed27-67be-4343-849b-b9b783cabe07 on atsqa4-133.qa.lab:31010 ]


java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
        at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
        at sqlline.SqlLine.print(SqlLine.java:1809)
        at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
        at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
        at sqlline.SqlLine.dispatch(SqlLine.java:889)
        at sqlline.SqlLine.begin(SqlLine.java:763)
        at sqlline.SqlLine.start(SqlLine.java:498)
        at sqlline.SqlLine.main(SqlLine.java:460)
{code}

Query plan:
{code}
00-01      Project(a1=[$0], EXPR$1=[$1])
00-02        SelectionVectorRemover
00-03          Sort(sort0=[$0], dir0=[ASC])
00-04            Project(a1=[$0], EXPR$1=[$1])
00-05              HashAgg(group=[{0}], EXPR$1=[COUNT()])
00-06                Project($f1=[$0])
00-07                  HashJoin(condition=[=($1, $2)], joinType=[inner])
00-09                    Project($f1=[$0], $f2=[$0])
00-11                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/t1]], selectionRoot=/aggregation/t1, numFiles=1, columns=[`a1`]]])
00-08                    HashAgg(group=[{0}])
00-10                      StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-12                        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/t2]], selectionRoot=/aggregation/t2, numFiles=1, columns=[`a2`]]])
{code}

Same query with "not in" works correctly:
{code}
0: jdbc:drill:schema=dfs> select a1, count(*) from t1 where (a1) not in (select  max(a2) from t2) group by a1 order by a1;
+------------+------------+
|     a1     |   EXPR$1   |
+------------+------------+
| 1          | 1          |
| 2          | 1          |
| 3          | 1          |
| 4          | 1          |
| 5          | 1          |
| 6          | 1          |
| 7          | 1          |
| 10         | 1          |
+------------+------------+
8 rows selected (0.182 seconds)
{code}

I'm choosing the "Execution-general" component because the query plan looks correct at first glance.
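
For reference, a minimal sketch of the result the failing query would be expected to return. It runs against SQLite through Python's sqlite3 module, not against Drill, so it only illustrates the SQL semantics and says nothing about the cause of the exception. The a1 values mirror those in the "not in" output above; the a2 values are invented for illustration, since the contents of t2 are not shown in this report. Because the subquery is a scalar aggregate, the IN predicate should behave like an equality comparison against that single value.

```python
import sqlite3

# Hypothetical data: the a1 values are copied from the "not in" output in the
# report; the a2 values are made up, because t2's contents are not shown.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a1 INTEGER)")
conn.execute("CREATE TABLE t2 (a2 INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(v,) for v in (1, 2, 3, 4, 5, 6, 7, 10)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(v,) for v in (3, 8, 9)])

# The query shape from the report: a scalar aggregate subquery inside IN.
in_rows = conn.execute(
    "SELECT a1, COUNT(*) FROM t1 "
    "WHERE a1 IN (SELECT MIN(a2) FROM t2) "
    "GROUP BY a1 ORDER BY a1"
).fetchall()

# The same predicate written as an equality against the scalar subquery.
eq_rows = conn.execute(
    "SELECT a1, COUNT(*) FROM t1 "
    "WHERE a1 = (SELECT MIN(a2) FROM t2) "
    "GROUP BY a1 ORDER BY a1"
).fetchall()

print(in_rows)
print(eq_rows)
```

With this made-up data both forms return the single group whose a1 equals MIN(a2). Rewriting IN as an equality in this way might also serve as a workaround to try in Drill, though that is untested here.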


  was:
Failing query:
{code}
SELECT a1, 
       COUNT(*) 
FROM   t1 
WHERE  ( a1 ) IN (SELECT MAX(a2) 
                               FROM   t2) 
GROUP  BY a1 
ORDER  BY a1; 
{code}

{code}
0: jdbc:drill:schema=dfs> select a1, count(*) from t1 where (a1) in (select  min(a2) from t2) group by a1 order by a1;
+------------+------------+
|     a1     |   EXPR$1   |
+------------+------------+
Query failed: RemoteRpcException: Failure while running fragment., index: 0, length: 1 (expected: range(0, 0)) [ 00b3ed27-67be-4343-849b-b9b783cabe07 on atsqa4-133.qa.lab:31010 ]
[ 00b3ed27-67be-4343-849b-b9b783cabe07 on atsqa4-133.qa.lab:31010 ]


java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
        at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
        at sqlline.SqlLine.print(SqlLine.java:1809)
        at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
        at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
        at sqlline.SqlLine.dispatch(SqlLine.java:889)
        at sqlline.SqlLine.begin(SqlLine.java:763)
        at sqlline.SqlLine.start(SqlLine.java:498)
        at sqlline.SqlLine.main(SqlLine.java:460)
{code}

Query plan:
{code}
00-01      Project(a1=[$0], EXPR$1=[$1])
00-02        SelectionVectorRemover
00-03          Sort(sort0=[$0], dir0=[ASC])
00-04            Project(a1=[$0], EXPR$1=[$1])
00-05              HashAgg(group=[{0}], EXPR$1=[COUNT()])
00-06                Project($f1=[$0])
00-07                  HashJoin(condition=[=($1, $2)], joinType=[inner])
00-09                    Project($f1=[$0], $f2=[$0])
00-11                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/t1]], selectionRoot=/aggregation/t1, numFiles=1, columns=[`a1`]]])
00-08                    HashAgg(group=[{0}])
00-10                      StreamAgg(group=[{}], EXPR$0=[MIN($0)])
00-12                        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/t2]], selectionRoot=/aggregation/t2, numFiles=1, columns=[`a2`]]])
{code}

I'm choosing the "Execution-general" component because the query plan looks correct at first glance.


> IOB exception when scalar subquery is used in the IN clause
> -----------------------------------------------------------
>
>                 Key: DRILL-2233
>                 URL: https://issues.apache.org/jira/browse/DRILL-2233
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>            Reporter: Victoria Markman
>            Assignee: Chris Westin
>         Attachments: drillbit.log, t1.parquet



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
