hive-dev mailing list archives

From "Harish Butani (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-10586) Plans for Queries with Select distinct and Windowing are incorrect
Date Sun, 03 May 2015 16:21:06 GMT
Harish Butani created HIVE-10586:
------------------------------------

             Summary: Plans for Queries with Select distinct and Windowing are incorrect
                 Key: HIVE-10586
                 URL: https://issues.apache.org/jira/browse/HIVE-10586
             Project: Hive
          Issue Type: Bug
          Components: PTF-Windowing, Query Planning
            Reporter: Harish Butani


Thanks to [~yhuai] for pointing this out.

The generated plan places the GBy operator (for the Select Distinct) below the PTFOp, so duplicates
are removed before the windowing functions are evaluated. One would expect the Select Distinct to
happen last. [~yhuai] confirmed this expected behavior in Postgres.
I think the following paragraph in the SQL spec states this order (though I am not an expert in
deciphering the language in the spec; if an expert on the spec wants to chime in, please do):
{noformat}
Point h) on Page 222,  in the 2011 SQL Spec, seems to state this:

h)  Case:

i)  If OF is simply contained in a <query specification> QSX, then QSX is equivalent
to:

SELECT SQ SLNEW TENEW
{noformat} 

Here is an example from windowing.q
{noformat}
-- 35. testDistinctWithWindowing
select DISTINCT p_mfgr, p_name, p_size,
sum(p_size) over w1 as s
from part
window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding and 2 following)
{noformat}
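
To illustrate why the operator order matters, here is a minimal Python sketch (not Hive code) that evaluates a query of the same shape in both orders. The rows, the helper names {{window_sum}} and {{distinct}}, and the tie-break on p_size are all assumptions made for illustration; it also assumes that "GBy below the PTFOp" means rows are de-duplicated before the window sum is computed.

```python
from collections import defaultdict

# Hypothetical rows of (p_mfgr, p_name, p_size); not taken from the real part table.
ROWS = [
    ("M1", "a", 2),
    ("M1", "a", 2),   # exact duplicate of the row above
    ("M1", "b", 6),
    ("M1", "c", 34),
]


def window_sum(rows, preceding=2, following=2):
    """Append sum(p_size) over a ROWS frame, partitioned by p_mfgr and sorted by p_name."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)
    out = []
    for mfgr in sorted(partitions):
        # Ties on p_name are broken by p_size only to keep this sketch deterministic.
        part = sorted(partitions[mfgr], key=lambda r: (r[1], r[2]))
        for i, (m, name, size) in enumerate(part):
            frame = part[max(0, i - preceding): i + following + 1]
            out.append((m, name, size, sum(r[2] for r in frame)))
    return out


def distinct(rows):
    """Drop duplicate rows, keeping the first occurrence of each."""
    return list(dict.fromkeys(rows))


# Expected order: windowing first, Select Distinct last.
expected = distinct(window_sum(ROWS))
# Order assumed for the reported plan: distinct first, windowing afterwards.
planned = window_sum(distinct(ROWS))

print(expected)
print(planned)
```

In this sketch the two orders return different row counts and different sums, because removing the duplicate input row first changes which rows fall inside each window frame.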



