hive-dev mailing list archives

From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-2056) Generate single MR job for multi groupby query.
Date Mon, 09 May 2011 17:09:03 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030814#comment-13030814 ]

jiraposter@reviews.apache.org commented on HIVE-2056:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/700/#review651
-----------------------------------------------------------


Update hive-default.xml to include the new parameter.
Also mention the new parameter in the title of the JIRA.


trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
<https://reviews.apache.org/r/700/#comment1306>

    Add a comment: this optimization is not enabled
    if one of the sub-queries does not involve an
    aggregation.



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
<https://reviews.apache.org/r/700/#comment1307>

    The code is not performing a prefix match.
    I mean,
    
    if the query is:
    
    from T
    insert overwrite T1 select ... group by c1
    insert overwrite T1 select ... group by c2, c1
    
    
    c1 will still be returned.
    
    Is that desirable?
    
    I don't think this will work. Can you add a testcase
    for this, I mean, one with an explain which shows that
    the parameter does not make a difference?
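
    To make the distinction concrete, here is a minimal sketch of the two
    ways a "common group by key" could be computed for the query above.
    The class and method names are hypothetical and are not taken from
    SemanticAnalyzer.java; it only illustrates why c1 comes back under an
    order-insensitive match but not under a prefix match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the code under review.
public class CommonGroupByKeys {

    // Order-insensitive match: keeps every key of the first group by
    // that also appears somewhere in each of the other group bys.
    public static List<String> commonSet(List<List<String>> groupBys) {
        List<String> common = new ArrayList<>(groupBys.get(0));
        for (List<String> keys : groupBys.subList(1, groupBys.size())) {
            common.retainAll(keys);
        }
        return common;
    }

    // Prefix match: keeps only the leading keys that agree position by
    // position across all group bys.
    public static List<String> commonPrefix(List<List<String>> groupBys) {
        List<String> prefix = new ArrayList<>(groupBys.get(0));
        for (List<String> keys : groupBys.subList(1, groupBys.size())) {
            int i = 0;
            while (i < prefix.size() && i < keys.size()
                    && prefix.get(i).equals(keys.get(i))) {
                i++;
            }
            prefix = new ArrayList<>(prefix.subList(0, i));
        }
        return prefix;
    }

    public static void main(String[] args) {
        // The two group by clauses from the query above.
        List<List<String>> groupBys = Arrays.asList(
                Arrays.asList("c1"),
                Arrays.asList("c2", "c1"));
        System.out.println(commonSet(groupBys));    // [c1]
        System.out.println(commonPrefix(groupBys)); // []
    }
}
```

    Both methods assume at least one group by clause is passed in.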
    


- namit


On 2011-05-09 13:36:28, Amareshwari Sriramadasu wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/700/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-05-09 13:36:28)
bq.  
bq.  
bq.  Review request for hive.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Attached patch generates a single M/R job for multi group by query with non-null common
bq.  group by key set. Added configuration hive.multigroupby.singlemr to turn on and off the optimization.
bq.  
bq.  
bq.  This addresses bug HIVE-2056.
bq.      https://issues.apache.org/jira/browse/HIVE-2056
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1100910 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1100910 
bq.    trunk/ql/src/test/queries/clientpositive/groupby10.q 1100910 
bq.    trunk/ql/src/test/queries/clientpositive/groupby8.q 1100910 
bq.    trunk/ql/src/test/queries/clientpositive/groupby8_noskew.q 1100910 
bq.    trunk/ql/src/test/queries/clientpositive/groupby9.q 1100910 
bq.    trunk/ql/src/test/results/clientpositive/groupby10.q.out 1100910 
bq.    trunk/ql/src/test/results/clientpositive/groupby8.q.out 1100910 
bq.    trunk/ql/src/test/results/clientpositive/groupby9.q.out 1100910 
bq.  
bq.  Diff: https://reviews.apache.org/r/700/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Updated jira with performance tests.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Amareshwari
bq.  
bq.



> Generate single MR job for multi groupby query.
> -----------------------------------------------
>
>                 Key: HIVE-2056
>                 URL: https://issues.apache.org/jira/browse/HIVE-2056
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.8.0
>
>         Attachments: patch-2056.txt
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
