Date: Sun, 3 May 2015 16:21:06 +0000 (UTC)
From: "Harish Butani (JIRA)"
To: dev@hive.apache.org
Subject: [jira] [Created] (HIVE-10586) Plans for Queries with Select distinct and Windowing are incorrect

Harish Butani created HIVE-10586:
------------------------------------

             Summary: Plans for Queries with Select distinct and Windowing are incorrect
                 Key: HIVE-10586
                 URL: https://issues.apache.org/jira/browse/HIVE-10586
             Project: Hive
          Issue Type: Bug
          Components: PTF-Windowing, Query Planning
            Reporter: Harish Butani

Thanks to [~yhuai] for pointing this out.

The Plan generated has the GBy Operator (for the Select Distinct) placed below the PTFOp. One would expect the Select Distinct to happen last. [~yhuai] confirmed this behavior in Postgres.
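To illustrate why the operator order matters, here is a minimal Python sketch (not Hive code). It assumes, as I read the report, that the incorrect plan removes duplicates before the windowing step. The sample rows, the helper names distinct and windowed_sum, and the stable tie-breaking for equal p_name values are all hypothetical and chosen only for illustration. The frame is "rows between 2 preceding and 2 following", partitioned by p_mfgr and ordered by p_name.

```python
from collections import defaultdict

# Hypothetical rows of (p_mfgr, p_name, p_size); the duplicate row is deliberate.
ROWS = [
    ("m1", "a", 1),
    ("m1", "a", 1),
    ("m1", "b", 2),
    ("m1", "c", 3),
]


def distinct(rows):
    """Drop duplicate rows, keeping first-seen order."""
    return list(dict.fromkeys(rows))


def windowed_sum(rows, preceding=2, following=2):
    """Append sum(p_size) over a per-p_mfgr row frame ordered by p_name."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[0]].append(row)
    out = []
    for mfgr in sorted(parts):
        # sorted() is stable, so ties on p_name keep their input order
        # (an assumption; a real engine may order ties differently).
        part = sorted(parts[mfgr], key=lambda r: r[1])
        for i, row in enumerate(part):
            lo = max(0, i - preceding)
            hi = min(len(part), i + following + 1)
            out.append(row + (sum(r[2] for r in part[lo:hi]),))
    return out


# Expected order: windowing first, Select Distinct last.
expected = distinct(windowed_sum(ROWS))
# Order assumed for the incorrect plan: Select Distinct first, then windowing.
reported = windowed_sum(distinct(ROWS))

print("distinct last :", expected)
print("distinct first:", reported)
```

With these made-up rows the two orders disagree on both the row count and the sums, because dropping the duplicate row first changes what falls inside each window frame.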

I think this paragraph in the SQL spec states this order (though I am not an expert in deciphering the language in the spec; if an expert on the spec wants to chime in, please do):
{noformat}
Point h) on Page 222, in the 2011 SQL Spec, seems to state this:
h) Case:
  i) If OF is simply contained in a QSX, then QSX is equivalent to:
     SELECT SQ SLNEW TENEW
{noformat}

Here is an example from windowing.q:
{noformat}
35. testDistinctWithWindowing
select DISTINCT p_mfgr, p_name, p_size,
sum(p_size) over w1 as s
from part
window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding and 2 following)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)