drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sean Hsuan-Yi Chu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3567) Wrong result in a query with multiple window functions and different over clauses
Date Wed, 05 Aug 2015 05:00:09 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654815#comment-14654815
] 

Sean Hsuan-Yi Chu commented on DRILL-3567:
------------------------------------------

https://reviews.apache.org/r/37116/

> Wrong result in a query with multiple window functions and different over clauses
> ---------------------------------------------------------------------------------
>
>                 Key: DRILL-3567
>                 URL: https://issues.apache.org/jira/browse/DRILL-3567
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.1.0
>         Environment: private-branch-with-multiple-partitions-enabled
>            Reporter: Victoria Markman
>            Assignee: Sean Hsuan-Yi Chu
>            Priority: Critical
>              Labels: window_function
>         Attachments: t1_parquet
>
>
> {code}
> 0: jdbc:drill:drillbit=localhost> select * from t1;
> +-------+--------+-------------+
> |  a1   |   b1   |     c1      |
> +-------+--------+-------------+
> | 1     | aaaaa  | 2015-01-01  |
> | 2     | bbbbb  | 2015-01-02  |
> | 3     | ccccc  | 2015-01-03  |
> | 4     | null   | 2015-01-04  |
> | 5     | eeeee  | 2015-01-05  |
> | 6     | fffff  | 2015-01-06  |
> | 7     | ggggg  | 2015-01-07  |
> | null  | hhhhh  | 2015-01-08  |
> | 9     | iiiii  | null        |
> | 10    | jjjjj  | 2015-01-10  |
> +-------+--------+-------------+
> 10 rows selected (0.078 seconds)
> {code}
> Wrong result, columns are projected in the wrong order:
> {code}
> 0: jdbc:drill:drillbit=localhost> select
> . . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1)
as count1,
> . . . . . . . . . . . . . . . . >         count(*) over(partition by a1 order by c1)
as count2,
> . . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1)
as sum1
> . . . . . . . . . . . . . . . . > from 
> . . . . . . . . . . . . . . . . >         t1;
> +---------+---------+-------+
> | count1  | count2  | sum1  |
> +---------+---------+-------+
> | 1       | 1       | 1     |
> | 1       | 2       | 1     |
> | 1       | 3       | 1     |
> | 1       | 4       | 1     |
> | 1       | 5       | 1     |
> | 1       | 6       | 1     |
> | 1       | 7       | 1     |
> | 1       | 9       | 1     |
> | 1       | 10      | 1     |
> | 1       | null    | 1     |
> +---------+---------+-------+
> 10 rows selected (0.113 seconds)
> {code}
> Explain plan:
> {code}
> 0: jdbc:drill:drillbit=localhost> explain plan for select
> . . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1)
as count1,
> . . . . . . . . . . . . . . . . >         count(*) over(partition by a1 order by c1)
as count2,
> . . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1)
as sum1
> . . . . . . . . . . . . . . . . > from 
> . . . . . . . . . . . . . . . . >         t1;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      ProjectAllowDup(count1=[$0], count2=[$1], sum1=[$2])
> 00-02        Project(w0$o0=[$4], w0$o1=[$5], w1$o0=[$6])
> 00-03          Window(window#0=[window(partition {3} order by [2] range between UNBOUNDED
PRECEDING and CURRENT ROW aggs [COUNT()])])
> 00-04            SelectionVectorRemover
> 00-05              Sort(sort0=[$3], sort1=[$2], dir0=[ASC], dir1=[ASC])
> 00-06                Window(window#0=[window(partition {1} order by [2] range between
UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT(), SUM($3)])])
> 00-07                  SelectionVectorRemover
> 00-08                    Sort(sort0=[$1], sort1=[$2], dir0=[ASC], dir1=[ASC])
> 00-09                      Project(T61¦¦*=[$0], b1=[$1], c1=[$2], a1=[$3])
> 00-10                        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=file:/Users/vmarkman/drill/testdata/subqueries/t1]], selectionRoot=file:/Users/vmarkman/drill/testdata/subqueries/t1,
numFiles=1, columns=[`*`]]])
> {code}
> If you remove frame that is not the same as other two, query works correctly:
> {code}
> 0: jdbc:drill:drillbit=localhost> select
> . . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1)
as count1,
> . . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1)
as sum1
> . . . . . . . . . . . . . . . . > from
> . . . . . . . . . . . . . . . . >         t1;
> +---------+-------+
> | count1  | sum1  |
> +---------+-------+
> | 1       | 1     |
> | 1       | 2     |
> | 1       | 3     |
> | 1       | 5     |
> | 1       | 6     |
> | 1       | 7     |
> | 1       | null  |
> | 1       | 9     |
> | 1       | 10    |
> | 1       | 4     |
> +---------+-------+
> 10 rows selected (0.099 seconds)
> {code}
> and in the different order (just for fun) :
> {code}
> 0: jdbc:drill:drillbit=localhost> select
> . . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1)
as sum1,
> . . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1)
as count1
> . . . . . . . . . . . . . . . . > from 
> . . . . . . . . . . . . . . . . >         t1;
> +-------+---------+
> | sum1  | count1  |
> +-------+---------+
> | 1     | 1       |
> | 2     | 1       |
> | 3     | 1       |
> | 5     | 1       |
> | 6     | 1       |
> | 7     | 1       |
> | null  | 1       |
> | 9     | 1       |
> | 10    | 1       |
> | 4     | 1       |
> +-------+---------+
> 10 rows selected (0.096 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message