drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Khurram Faraaz (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3689) incorrect results : aggregate AVG returns wrong results over results returned by LEAD function.
Date Sat, 22 Aug 2015 00:59:45 GMT
Khurram Faraaz created DRILL-3689:
-------------------------------------

             Summary: incorrect results : aggregate AVG returns wrong results over results
returned by LEAD function.
                 Key: DRILL-3689
                 URL: https://issues.apache.org/jira/browse/DRILL-3689
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.2.0
         Environment:  private-branch https://github.com/adeneche/incubator-drill/tree/new-window-funcs
            Reporter: Khurram Faraaz
            Assignee: Chris Westin
            Priority: Critical


Aggregate AVG returns wrong results over results returned by LEAD function.

results returned by Drill
{code}
0: jdbc:drill:schema=dfs.tmp> SELECT  avg(lead_col1) FROM (SELECT LEAD(col1) OVER(PARTITION
BY col7 ORDER BY col1) lead_col1 , col7 FROM FEWRWSPQQ_101) sub_query;
+-------------------------+
|         EXPR$0          |
+-------------------------+
| 2.35195986941647008E17  |
+-------------------------+
1 row selected (0.264 seconds)
{code}

Explain plan for above query from Drill

{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for SELECT  avg(lead_col1) FROM (SELECT LEAD(col1)
OVER(PARTITION BY col7 ORDER BY col1) lead_col1 , col7 FROM FEWRWSPQQ_101) sub_query;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[CAST(/(CastHigh(CASE(=($1, 0), null, $0)), $1)):ANY NOT NULL])
00-03          StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[COUNT($0)])
00-04            Project(w0$o0=[$3])
00-05              Window(window#0=[window(partition {2} order by [1] range between UNBOUNDED
PRECEDING and CURRENT ROW aggs [LEAD($1)])])
00-06                SelectionVectorRemover
00-07                  Sort(sort0=[$2], sort1=[$1], dir0=[ASC], dir1=[ASC])
00-08                    Project(T36¦¦*=[$0], col1=[$1], col7=[$2])
00-09                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/FEWRWSPQQ_101]],
selectionRoot=maprfs:/tmp/FEWRWSPQQ_101, numFiles=1, columns=[`*`]]])
{code}

results returned by Postgres
{code}
postgres=# SELECT  avg(lead_col1) FROM (SELECT LEAD(col1) OVER(PARTITION BY col7 ORDER BY
col1) lead_col1 , col7 FROM FEWRWSPQQ_101) sub_query;
         avg         
---------------------
 1157533190627124568
(1 row)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message