Return-Path: X-Original-To: apmail-drill-issues-archive@minotaur.apache.org Delivered-To: apmail-drill-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 01F011814E for ; Tue, 16 Jun 2015 18:31:01 +0000 (UTC) Received: (qmail 70564 invoked by uid 500); 16 Jun 2015 18:31:00 -0000 Delivered-To: apmail-drill-issues-archive@drill.apache.org Received: (qmail 70537 invoked by uid 500); 16 Jun 2015 18:31:00 -0000 Mailing-List: contact issues-help@drill.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@drill.apache.org Delivered-To: mailing list issues@drill.apache.org Received: (qmail 70527 invoked by uid 99); 16 Jun 2015 18:31:00 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 16 Jun 2015 18:31:00 +0000 Date: Tue, 16 Jun 2015 18:31:00 +0000 (UTC) From: "Deneche A. Hakim (JIRA)" To: issues@drill.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (DRILL-3298) Wrong result with SUM window function and order by without partition by in the OVER clause MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/DRILL-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588525#comment-14588525 ] Deneche A. Hakim commented on DRILL-3298: ----------------------------------------- [~amansinha100] can you take a look at this plan (when slice_target = 1) is it correct ? looks like the sort and window function are applied in parallel on the incoming data, if this is the case then the window function will compute wrong results: {noformat} 00-01 Project(c_integer=[$0], EXPR$1=[$1]) 00-02 SingleMergeExchange(sort0=[0 ASC], sort1=[1 ASC]) 01-01 SelectionVectorRemover 01-02 Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) 01-03 Project(c_integer=[$0], EXPR$1=[$1]) 01-04 HashToRandomExchange(dist0=[[$0]], dist1=[[$1]]) 02-01 UnorderedMuxExchange 03-01 Project(c_integer=[$0], EXPR$1=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($1, hash64AsDouble($0)))]) 03-02 Project(c_integer=[$0], EXPR$1=[CASE(>($1, 0), CAST($2):ANY, null)]) 03-03 Window(window#0=[window(partition {} order by [0 DESC] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])]) 03-04 SelectionVectorRemover 03-05 Sort(sort0=[$0], dir0=[DESC]) 03-06 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/j1]], selectionRoot=/drill/testdata/subqueries/j1, numFiles=1, columns=[`c_integer`]]]) {noformat} > Wrong result with SUM window function and order by without partition by in the OVER clause > ------------------------------------------------------------------------------------------ > > Key: DRILL-3298 > URL: https://issues.apache.org/jira/browse/DRILL-3298 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow > Affects Versions: 1.0.0 > Reporter: Victoria Markman > Assignee: Chris Westin > Priority: Critical > Labels: window_function > Attachments: j1.tar, test.res > > > This query returns incorrect result when planner.slice_target = 1 > {code} > select > j1.c_integer, > sum(j1.c_integer) over w > from j1 > window w as (order by c_integer desc) > order by > 1, 2; > {code} > Query plan with planner.slice_target = 1 > {noformat} > 00-01 Project(c_integer=[$0], EXPR$1=[$1]) > 00-02 SingleMergeExchange(sort0=[0 ASC], sort1=[1 ASC]) > 01-01 SelectionVectorRemover > 01-02 Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) > 01-03 Project(c_integer=[$0], EXPR$1=[$1]) > 01-04 HashToRandomExchange(dist0=[[$0]], dist1=[[$1]]) > 02-01 UnorderedMuxExchange > 03-01 Project(c_integer=[$0], EXPR$1=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($1, hash64AsDouble($0)))]) > 03-02 Project(c_integer=[$0], EXPR$1=[CASE(>($1, 0), CAST($2):ANY, null)]) > 03-03 Window(window#0=[window(partition {} order by [0 DESC] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])]) > 03-04 SelectionVectorRemover > 03-05 Sort(sort0=[$0], dir0=[DESC]) > 03-06 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/j1]], selectionRoot=/drill/testdata/subqueries/j1, numFiles=1, columns=[`c_integer`]]]) > {noformat} > Query plan with planner.slice_target = 100000; > {noformat} > 00-01 Project(c_integer=[$0], EXPR$1=[$1]) > 00-02 SelectionVectorRemover > 00-03 Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) > 00-04 Project(c_integer=[$0], EXPR$1=[CASE(>($1, 0), CAST($2):ANY, null)]) > 00-05 Window(window#0=[window(partition {} order by [0 DESC] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])]) > 00-06 SelectionVectorRemover > 00-07 Sort(sort0=[$0], dir0=[DESC]) > 00-08 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/j1]], selectionRoot=/drill/testdata/subqueries/j1, numFiles=1, columns=[`c_integer`]]]) > {noformat} > Attached: > * table j1 > * test.res - result generated with postgres -- This message was sent by Atlassian JIRA (v6.3.4#6332)