drill-issues mailing list archives

From "Timothy Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1487) Drill window functions return wrong results
Date Tue, 14 Oct 2014 00:03:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170266#comment-14170266 ]

Timothy Chen commented on DRILL-1487:
-------------------------------------

Hi [~jni], so if I understand correctly, you're saying that when an order by is added, I need
to include all rows with the same order by value when computing the aggregation function
over the matching partition by column?

In other words, with dataset
employee_id | position_id | salary
1           | 1           | 4
2           | 1           | 4
3           | 2           | 2

The sum with partition by and order by on position id returns:

1           | 1           | 8
2           | 1           | 8
3           | 2           | 2

instead of 

1           | 1           | 4
2           | 1           | 8
3           | 2           | 2

Where the 2nd table is a sliding window sum?

And also what happens when we add configurable offsets?
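
To make the two readings concrete, here is a minimal Python sketch that reproduces both result tables from the three sample rows above. It is only an illustration, not Drill's implementation: the function names are hypothetical, and it assumes the partition key and the order key are both position_id, as in the example. As far as I know, the first reading matches the SQL default frame when an order by is present (RANGE, where rows tied on the order by value are peers), and the second matches a ROWS frame.

```python
from itertools import groupby

# Sample rows from the message: (employee_id, position_id, salary)
rows = [(1, 1, 4), (2, 1, 4), (3, 2, 2)]


def position_id(row):
    """Key used both for partitioning and for ordering in this example."""
    return row[1]


def sum_with_peers(rows):
    """Hypothetical helper: rows tied on the order key are peers.

    Because the partition key and the order key are the same column here,
    every row in a partition is a peer of every other row, so each row
    gets the full partition total.
    """
    out = []
    for _, group in groupby(sorted(rows, key=position_id), key=position_id):
        group = list(group)
        total = sum(salary for _, _, salary in group)
        out.extend((emp, pos, total) for emp, pos, _ in group)
    return out


def running_sum_by_row(rows):
    """Hypothetical helper: row-by-row running sum within each partition."""
    out = []
    for _, group in groupby(sorted(rows, key=position_id), key=position_id):
        running = 0
        for emp, pos, salary in group:
            running += salary
            out.append((emp, pos, running))
    return out


print(sum_with_peers(rows))      # first table
print(running_sum_by_row(rows))  # second table
```

The first call yields the 8 / 8 / 2 table and the second yields the 4 / 8 / 2 table.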



> Drill window functions return wrong results
> -------------------------------------------
>
>                 Key: DRILL-1487
>                 URL: https://issues.apache.org/jira/browse/DRILL-1487
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Neeraja
>            Assignee: Timothy Chen
>
> The following window function is meant to show how a given employee's salary compares
> to the avg(salary) for his/her position.
> The query executes fine but returns wrong results (expected: the avg(salary) stays the
> same for a given window, i.e. position_id).
> 0: jdbc:drill:zk=local> SELECT employee_id, position_id, salary, avg(salary) OVER (PARTITION BY position_id ORDER BY position_id) FROM cp.`employee.json` ORDER BY employee_id;
> +-------------+-------------+------------+--------------------+
> | employee_id | position_id |   salary   |       EXPR$3       |
> +-------------+-------------+------------+--------------------+
> | 1           | 1           | 80000.0    | 80000.0            |
> | 2           | 2           | 40000.0    | 37500.0            |
> | 4           | 2           | 40000.0    | 38333.333333333336 |
> | 5           | 2           | 35000.0    | 35000.0            |
> | 6           | 3           | 25000.0    | 25000.0            |
> | 7           | 4           | 15000.0    | 15000.0            |
> | 8           | 11          | 10000.0    | 14333.333333333334 |
> | 9           | 11          | 17000.0    | 17000.0            |



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
