hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Phabricator (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-2340) optimize orderby followed by a groupby
Date Wed, 13 Feb 2013 07:58:15 GMT

     [ https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Phabricator updated HIVE-2340:
------------------------------

    Attachment: HIVE-2340.D1209.12.patch

navis updated the revision "HIVE-2340 [jira] optimize orderby followed by a groupby".

  1. Changed policy of creating new metadatas(colExprMap, etc) in ColumnPrunerProcFactory.pruneReduceSinkOperator()
  - Remove not retained values from RowResolver, colExprMap and schema (instead of creating
new entities by adding retained values)
  2. Changed order of applying CP and PPD. Now PPD applies first and CP next (which was CP-PPD)
  - CP removes some expr mappings which was not yet propagated by PPD
  - Also removed pruning schema of FilterOperator, which seemed not right (It's not certain
that TS will actually prune columns)
  3. Refactored to share same code base in ExprNodeDescUtils which was introduced by HIVE-2839

  Will run full test tonight

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D1209

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D1209?vs=27315&id=27669#toc

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  conf/hive-default.xml.template
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
  ql/src/test/queries/clientpositive/auto_join26.q
  ql/src/test/queries/clientpositive/groupby_distinct_samekey.q
  ql/src/test/queries/clientpositive/reduce_deduplicate.q
  ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q
  ql/src/test/results/clientpositive/cluster.q.out
  ql/src/test/results/clientpositive/groupby2.q.out
  ql/src/test/results/clientpositive/groupby2_map_skew.q.out
  ql/src/test/results/clientpositive/groupby_cube1.q.out
  ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out
  ql/src/test/results/clientpositive/groupby_rollup1.q.out
  ql/src/test/results/clientpositive/index_bitmap3.q.out
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out
  ql/src/test/results/clientpositive/infer_bucket_sort.q.out
  ql/src/test/results/clientpositive/ppd2.q.out
  ql/src/test/results/clientpositive/ppd_gby_join.q.out
  ql/src/test/results/clientpositive/reduce_deduplicate.q.out
  ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out
  ql/src/test/results/clientpositive/semijoin.q.out
  ql/src/test/results/clientpositive/union24.q.out
  ql/src/test/results/compiler/plan/input2.q.xml
  ql/src/test/results/compiler/plan/input3.q.xml
  ql/src/test/results/compiler/plan/join1.q.xml
  ql/src/test/results/compiler/plan/join2.q.xml
  ql/src/test/results/compiler/plan/join3.q.xml
  ql/src/test/results/compiler/plan/sample1.q.xml
  ql/src/test/results/compiler/plan/sample2.q.xml
  ql/src/test/results/compiler/plan/sample3.q.xml
  ql/src/test/results/compiler/plan/sample4.q.xml
  ql/src/test/results/compiler/plan/sample5.q.xml
  ql/src/test/results/compiler/plan/sample6.q.xml
  ql/src/test/results/compiler/plan/sample7.q.xml

To: JIRA, navis
Cc: hagleitn, njain

                
> optimize orderby followed by a groupby
> --------------------------------------
>
>                 Key: HIVE-2340
>                 URL: https://issues.apache.org/jira/browse/HIVE-2340
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>              Labels: perfomance
>         Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch,
ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch,
ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.12.patch, HIVE-2340.13.patch,
HIVE-2340.1.patch.txt, HIVE-2340.D1209.10.patch, HIVE-2340.D1209.11.patch, HIVE-2340.D1209.12.patch,
HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, HIVE-2340.D1209.8.patch, HIVE-2340.D1209.9.patch,
testclidriver.txt
>
>
> Before implementing optimizer for JOIN-GBY, try to implement RS-GBY optimizer(cluster-by
following group-by).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message