hive-dev mailing list archives

From "Phabricator (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby
Date Wed, 30 Jan 2013 21:07:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566898#comment-13566898 ]

Phabricator commented on HIVE-2340:
-----------------------------------

hagleitn has commented on the revision "HIVE-2340 [jira] optimize orderby followed by a groupby".

  Partial review

INLINE COMMENTS
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:521 Not sure why this is needed
or why it defaults to 4. From the comment below, it seems this is just to avoid the single-reducer
order-by case for performance reasons. Is that correct?
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:787 Is this
required, or is it extra protection? The comment at the top of the file says the mapjoin optimization
happens before this (and it probably should, for performance reasons). Also, if I understand it
correctly, "joinAndSort" might be a better name than "fixed". You're basically saying that if an
optimization wants to change the join after this, it needs to make sure the ordering of the keys is
preserved, right?
  ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java:136 This seems
orthogonal to this patch.
  ql/src/test/queries/clientpositive/reduce_deduplicate.q:7 There are not many tests for
min.reducer=1; there is no order-by case, for instance. Maybe reduce_deduplicate_extended.q
should run with both the default and min.reducer=1.

REVISION DETAIL
  https://reviews.facebook.net/D1209

To: JIRA, navis
Cc: hagleitn

                
> optimize orderby followed by a groupby
> --------------------------------------
>
>                 Key: HIVE-2340
>                 URL: https://issues.apache.org/jira/browse/HIVE-2340
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>              Labels: perfomance
>         Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch,
> ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch,
> ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt, HIVE-2340.D1209.6.patch,
> HIVE-2340.D1209.7.patch, HIVE-2340.D1209.8.patch, HIVE-2340.D1209.9.patch, testclidriver.txt
>
>
> Before implementing an optimizer for JOIN-GBY, try to implement an RS-GBY optimizer (cluster-by
> following group-by).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
