drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Victoria Markman (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3567) Wrong result in a query with multiple window functions and different over clauses
Date Mon, 27 Jul 2015 22:48:04 GMT
Victoria Markman created DRILL-3567:
---------------------------------------

             Summary: Wrong result in a query with multiple window functions and different
over clauses
                 Key: DRILL-3567
                 URL: https://issues.apache.org/jira/browse/DRILL-3567
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.1.0
         Environment: private-branch-with-multiple-partitions-enabled
            Reporter: Victoria Markman
            Assignee: Jinfeng Ni
            Priority: Critical


{code}
0: jdbc:drill:drillbit=localhost> select * from t1;
+-------+--------+-------------+
|  a1   |   b1   |     c1      |
+-------+--------+-------------+
| 1     | aaaaa  | 2015-01-01  |
| 2     | bbbbb  | 2015-01-02  |
| 3     | ccccc  | 2015-01-03  |
| 4     | null   | 2015-01-04  |
| 5     | eeeee  | 2015-01-05  |
| 6     | fffff  | 2015-01-06  |
| 7     | ggggg  | 2015-01-07  |
| null  | hhhhh  | 2015-01-08  |
| 9     | iiiii  | null        |
| 10    | jjjjj  | 2015-01-10  |
+-------+--------+-------------+
10 rows selected (0.078 seconds)
{code}

Wrong result, columns are projected in the wrong order:
{code}
0: jdbc:drill:drillbit=localhost> select
. . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1) as
count1,
. . . . . . . . . . . . . . . . >         count(*) over(partition by a1 order by c1) as
count2,
. . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1) as
sum1
. . . . . . . . . . . . . . . . > from 
. . . . . . . . . . . . . . . . >         t1;
+---------+---------+-------+
| count1  | count2  | sum1  |
+---------+---------+-------+
| 1       | 1       | 1     |
| 1       | 2       | 1     |
| 1       | 3       | 1     |
| 1       | 4       | 1     |
| 1       | 5       | 1     |
| 1       | 6       | 1     |
| 1       | 7       | 1     |
| 1       | 9       | 1     |
| 1       | 10      | 1     |
| 1       | null    | 1     |
+---------+---------+-------+
10 rows selected (0.113 seconds)
{code}

Explain plan:
{code}
0: jdbc:drill:drillbit=localhost> explain plan for select
. . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1) as
count1,
. . . . . . . . . . . . . . . . >         count(*) over(partition by a1 order by c1) as
count2,
. . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1) as
sum1
. . . . . . . . . . . . . . . . > from 
. . . . . . . . . . . . . . . . >         t1;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      ProjectAllowDup(count1=[$0], count2=[$1], sum1=[$2])
00-02        Project(w0$o0=[$4], w0$o1=[$5], w1$o0=[$6])
00-03          Window(window#0=[window(partition {3} order by [2] range between UNBOUNDED
PRECEDING and CURRENT ROW aggs [COUNT()])])
00-04            SelectionVectorRemover
00-05              Sort(sort0=[$3], sort1=[$2], dir0=[ASC], dir1=[ASC])
00-06                Window(window#0=[window(partition {1} order by [2] range between UNBOUNDED
PRECEDING and CURRENT ROW aggs [COUNT(), SUM($3)])])
00-07                  SelectionVectorRemover
00-08                    Sort(sort0=[$1], sort1=[$2], dir0=[ASC], dir1=[ASC])
00-09                      Project(T61¦¦*=[$0], b1=[$1], c1=[$2], a1=[$3])
00-10                        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=file:/Users/vmarkman/drill/testdata/subqueries/t1]], selectionRoot=file:/Users/vmarkman/drill/testdata/subqueries/t1,
numFiles=1, columns=[`*`]]])
{code}

If you remove frame that is not the same as other two, query works correctly:

{code}
0: jdbc:drill:drillbit=localhost> select
. . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1) as
count1,
. . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1) as
sum1
. . . . . . . . . . . . . . . . > from
. . . . . . . . . . . . . . . . >         t1;
+---------+-------+
| count1  | sum1  |
+---------+-------+
| 1       | 1     |
| 1       | 2     |
| 1       | 3     |
| 1       | 5     |
| 1       | 6     |
| 1       | 7     |
| 1       | null  |
| 1       | 9     |
| 1       | 10    |
| 1       | 4     |
+---------+-------+
10 rows selected (0.099 seconds)
{code}

and in the different order (just for fun) :
{code}
0: jdbc:drill:drillbit=localhost> select
. . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1) as
sum1,
. . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1) as
count1
. . . . . . . . . . . . . . . . > from 
. . . . . . . . . . . . . . . . >         t1;
+-------+---------+
| sum1  | count1  |
+-------+---------+
| 1     | 1       |
| 2     | 1       |
| 3     | 1       |
| 5     | 1       |
| 6     | 1       |
| 7     | 1       |
| null  | 1       |
| 9     | 1       |
| 10    | 1       |
| 4     | 1       |
+-------+---------+
10 rows selected (0.096 seconds)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message