drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Victoria Markman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-3188) Disable row range functionality for window functions
Date Tue, 26 May 2015 23:58:17 GMT

     [ https://issues.apache.org/jira/browse/DRILL-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Victoria Markman updated DRILL-3188:
------------------------------------
    Description: 
We don't support row range with window functions. So we should disable this functionality,
because currently we return default frame result.
The only frame we currently support is BETWEEN UNBOUNDED PRECEDING  AND CURRENT ROW.

If you don't specify frame in Calcite, this is exactly what you get:

{code}
0: jdbc:drill:schema=dfs> explain plan for select a2,b2,c2, cast(count(*) over(partition
by a2, substr(b2,1,2),c2 order by cast(a2 as double) + 100 ) as bigint) from t2 order by a2;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(a2=[$0], b2=[$1], c2=[$2], EXPR$3=[$3])
00-02        SelectionVectorRemover
00-03          Sort(sort0=[$0], dir0=[ASC])
00-04            Project(a2=[$0], b2=[$1], c2=[$2], EXPR$3=[$5])
00-05              Window(window#0=[window(partition {0, 2, 4} order by [3] range between
UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT()])])
00-06                SelectionVectorRemover
00-07                  Sort(sort0=[$0], sort1=[$2], sort2=[$4], sort3=[$3], dir0=[ASC], dir1=[ASC],
dir2=[ASC], dir3=[ASC])
00-08                    Project(a2=[$1], b2=[$0], c2=[$2], $3=[+(CAST($1):DOUBLE, 100)],
$4=[SUBSTR($0, 1, 2)])
00-09                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/aggregation/t2]],
selectionRoot=/drill/testdata/aggregation/t2, numFiles=1, columns=[`a2`, `b2`, `c2`]]])
{code}

Row range should be disabled as well:
{code}
0: jdbc:drill:schema=dfs> select * from t2;
+-----+--------+-------------+
| a2  |   b2   |     c2      |
+-----+--------+-------------+
| 0   | zzz    | 2014-12-31  |
| 1   | aaaaa  | 2015-01-01  |
| 2   | bbbbb  | 2015-01-02  |
| 2   | bbbbb  | 2015-01-02  |
| 2   | bbbbb  | 2015-01-02  |
| 3   | ccccc  | 2015-01-03  |
| 4   | ddddd  | 2015-01-04  |
| 5   | eeeee  | 2015-01-05  |
| 6   | fffff  | 2015-01-06  |
| 7   | ggggg  | 2015-01-07  |
| 7   | ggggg  | 2015-01-07  |
| 8   | hhhhh  | 2015-01-08  |
| 9   | iiiii  | 2015-01-09  |
+-----+--------+-------------+
13 rows selected (0.123 seconds)

0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 rows between
1 preceding and 1 following ) from t2 order by a2;
+-----+---------+
| a2  | EXPR$1  |
+-----+---------+
| 0   | 0       |
| 1   | 1       |
| 2   | 6       |
| 2   | 6       |
| 2   | 6       |
| 3   | 3       |
| 4   | 4       |
| 5   | 5       |
| 6   | 6       |
| 7   | 14      |
| 7   | 14      |
| 8   | 8       |
| 9   | 9       |
+-----+---------+
13 rows selected (0.2 seconds)
{code}


  was:
We don't support row range with window functions. So we should disable this functionality,
because currently we return default frame result.

{code}
0: jdbc:drill:schema=dfs> select * from t2;
+-----+--------+-------------+
| a2  |   b2   |     c2      |
+-----+--------+-------------+
| 0   | zzz    | 2014-12-31  |
| 1   | aaaaa  | 2015-01-01  |
| 2   | bbbbb  | 2015-01-02  |
| 2   | bbbbb  | 2015-01-02  |
| 2   | bbbbb  | 2015-01-02  |
| 3   | ccccc  | 2015-01-03  |
| 4   | ddddd  | 2015-01-04  |
| 5   | eeeee  | 2015-01-05  |
| 6   | fffff  | 2015-01-06  |
| 7   | ggggg  | 2015-01-07  |
| 7   | ggggg  | 2015-01-07  |
| 8   | hhhhh  | 2015-01-08  |
| 9   | iiiii  | 2015-01-09  |
+-----+--------+-------------+
13 rows selected (0.123 seconds)

0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 rows between
1 preceding and 1 following ) from t2 order by a2;
+-----+---------+
| a2  | EXPR$1  |
+-----+---------+
| 0   | 0       |
| 1   | 1       |
| 2   | 6       |
| 2   | 6       |
| 2   | 6       |
| 3   | 3       |
| 4   | 4       |
| 5   | 5       |
| 6   | 6       |
| 7   | 14      |
| 7   | 14      |
| 8   | 8       |
| 9   | 9       |
+-----+---------+
13 rows selected (0.2 seconds)
{code}



> Disable row range functionality for window functions
> ----------------------------------------------------
>
>                 Key: DRILL-3188
>                 URL: https://issues.apache.org/jira/browse/DRILL-3188
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.0.0
>            Reporter: Victoria Markman
>            Assignee: Jinfeng Ni
>            Priority: Critical
>
> We don't support row range with window functions. So we should disable this functionality,
because currently we return default frame result.
> The only frame we currently support is BETWEEN UNBOUNDED PRECEDING  AND CURRENT ROW.
> If you don't specify frame in Calcite, this is exactly what you get:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select a2,b2,c2, cast(count(*) over(partition
by a2, substr(b2,1,2),c2 order by cast(a2 as double) + 100 ) as bigint) from t2 order by a2;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(a2=[$0], b2=[$1], c2=[$2], EXPR$3=[$3])
> 00-02        SelectionVectorRemover
> 00-03          Sort(sort0=[$0], dir0=[ASC])
> 00-04            Project(a2=[$0], b2=[$1], c2=[$2], EXPR$3=[$5])
> 00-05              Window(window#0=[window(partition {0, 2, 4} order by [3] range between
UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT()])])
> 00-06                SelectionVectorRemover
> 00-07                  Sort(sort0=[$0], sort1=[$2], sort2=[$4], sort3=[$3], dir0=[ASC],
dir1=[ASC], dir2=[ASC], dir3=[ASC])
> 00-08                    Project(a2=[$1], b2=[$0], c2=[$2], $3=[+(CAST($1):DOUBLE, 100)],
$4=[SUBSTR($0, 1, 2)])
> 00-09                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=maprfs:///drill/testdata/aggregation/t2]], selectionRoot=/drill/testdata/aggregation/t2,
numFiles=1, columns=[`a2`, `b2`, `c2`]]])
> {code}
> Row range should be disabled as well:
> {code}
> 0: jdbc:drill:schema=dfs> select * from t2;
> +-----+--------+-------------+
> | a2  |   b2   |     c2      |
> +-----+--------+-------------+
> | 0   | zzz    | 2014-12-31  |
> | 1   | aaaaa  | 2015-01-01  |
> | 2   | bbbbb  | 2015-01-02  |
> | 2   | bbbbb  | 2015-01-02  |
> | 2   | bbbbb  | 2015-01-02  |
> | 3   | ccccc  | 2015-01-03  |
> | 4   | ddddd  | 2015-01-04  |
> | 5   | eeeee  | 2015-01-05  |
> | 6   | fffff  | 2015-01-06  |
> | 7   | ggggg  | 2015-01-07  |
> | 7   | ggggg  | 2015-01-07  |
> | 8   | hhhhh  | 2015-01-08  |
> | 9   | iiiii  | 2015-01-09  |
> +-----+--------+-------------+
> 13 rows selected (0.123 seconds)
> 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 rows
between 1 preceding and 1 following ) from t2 order by a2;
> +-----+---------+
> | a2  | EXPR$1  |
> +-----+---------+
> | 0   | 0       |
> | 1   | 1       |
> | 2   | 6       |
> | 2   | 6       |
> | 2   | 6       |
> | 3   | 3       |
> | 4   | 4       |
> | 5   | 5       |
> | 6   | 6       |
> | 7   | 14      |
> | 7   | 14      |
> | 8   | 8       |
> | 9   | 9       |
> +-----+---------+
> 13 rows selected (0.2 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message