hive-dev mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by
Date Thu, 07 Mar 2013 19:59:11 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596280#comment-13596280 ]

Ashutosh Chauhan commented on HIVE-4108:
----------------------------------------

I think we should remove the concept of inference from query-level distribute/sort.
For your first query, I would read it as the user first intending to partition on a constant
over the full table (which will be the first MR job) and then wanting a second partitioning on x
(the 2nd MR job), which, as you pointed out, is different from the current behavior.
For your second query, my reading is the same as for the first, which again deviates from the
implementation.
For the third query, the same ambiguity applies.

So, in all 3 cases the current behavior is different from what I would have expected. Automatic
inference is nasty. IMO we should drop it altogether. Distribute/Sort, if present in the query,
shouldn't impact any over() clause specified in the query. Whenever they are present, that
will just imply the user wants another MR job using that spec (which was the behavior in Hive
before this work).
                
> Allow over() clause to contain an order by with no partition by
> ---------------------------------------------------------------
>
>                 Key: HIVE-4108
>                 URL: https://issues.apache.org/jira/browse/HIVE-4108
>             Project: Hive
>          Issue Type: Bug
>          Components: PTF-Windowing
>            Reporter: Brock Noland
>
> HIVE-4073 allows over() to be called with no partition by and no order by. We should
> also allow over() to contain only an order by.
> From the review of HIVE-4073:
> Ashutosh
> {noformat}
> Can you also add following test. This should also work.
> select p_name, p_retailprice,
> avg(p_retailprice) over(order by p_name)
> from part
> partition by p_name;
> {noformat}
> Harish
> {noformat}
> This test will not work (:
> The grammar needs to be changed so:
> partitioningSpec
> @init { msgs.push("partitioningSpec clause"); }
> @after { msgs.pop(); } 
> :
> partitionByClause orderByClause? -> ^(TOK_PARTITIONINGSPEC partitionByClause orderByClause?) |
> orderByClause -> ^(TOK_PARTITIONINGSPEC orderByClause) |
> distributeByClause sortByClause? -> ^(TOK_PARTITIONINGSPEC distributeByClause sortByClause?) |
> sortByClause? -> ^(TOK_PARTITIONINGSPEC sortByClause) |
> clusterByClause -> ^(TOK_PARTITIONINGSPEC clusterByClause)
> ;
> And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of the AST
> Tree. The PTFTranslator also needs changes. Do this as another Jira
> {noformat}
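
For readers unfamiliar with what an over() clause with only an order by computes, here is a minimal sketch. It is not Hive: it uses SQLite (via Python's sqlite3 module, which needs SQLite 3.25 or later for window functions) as a stand-in, and the three sample rows are made up for illustration. The table and column names follow the test query quoted above. With no partition by, the whole table acts as a single partition, and with the default frame the aggregate runs from the first row up to the current row.

```python
import sqlite3

# In-memory stand-in for the "part" table; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (p_name TEXT, p_retailprice REAL)")
conn.executemany(
    "INSERT INTO part VALUES (?, ?)",
    [("almond", 10.0), ("birch", 20.0), ("cedar", 60.0)],
)

# over() with an ORDER BY and no PARTITION BY: every row belongs to one
# partition, so avg() becomes a running average in p_name order.
rows = conn.execute(
    """
    SELECT p_name, p_retailprice,
           avg(p_retailprice) OVER (ORDER BY p_name) AS running_avg
    FROM part
    ORDER BY p_name
    """
).fetchall()

for row in rows:
    print(row)
```

Whether Hive should additionally let a query-level distribute/sort influence such a clause is the open question in the comment above; this sketch only shows the window semantics in isolation.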

