hive-dev mailing list archives

From "Yin Huai" <h...@cse.ohio-state.edu>
Subject Re: Review Request: HIVE-2206: add a new optimizer for query correlation discovery and optimization
Date Mon, 24 Sep 2012 14:33:15 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7126/
-----------------------------------------------------------

(Updated Sept. 24, 2012, 2:33 p.m.)


Review request for hive.


Changes
-------

bug fix + 2 new tests


Description
-------

This optimizer exploits intra-query correlations and merges multiple correlated MapReduce
jobs into one job. I opened a new request because I have been working on hive-git.


This addresses bug HIVE-2206.
    https://issues.apache.org/jira/browse/HIVE-2206


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2693663 
  ql/src/java/org/apache/hadoop/hive/ql/exec/BaseReduceSinkOperator.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationCompositeOperator.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationLocalSimulativeReduceSinkOperator.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationReducerDispatchOperator.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java 283d0b6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 8669051 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 5f08519 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 0c22141 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 919a140 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a40630 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 1469325 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizerUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 40dd949 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java f292131 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 8bacd3d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 33ce6ca 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseReduceSinkDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationCompositeDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationLocalSimulativeReduceSinkDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationReducerDispatchDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 5f38bf2 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 16eb125 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9a95efd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 142f040 
  ql/src/test/queries/clientpositive/correlationoptimizer1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer4.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer5.q PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer3.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer4.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer5.q.out PRE-CREATION 
  ql/src/test/results/compiler/plan/groupby1.q.xml 4382252 
  ql/src/test/results/compiler/plan/groupby2.q.xml eef669c 
  ql/src/test/results/compiler/plan/groupby3.q.xml 9743480 
  ql/src/test/results/compiler/plan/groupby5.q.xml 8e07860 

Diff: https://reviews.apache.org/r/7126/diff/


Testing
-------

I could not test TestHBaseMinimrCliDriver, TestHBaseCliDriver, TestHBaseNegativeCliDriver, testSynchronized
in TestEmbeddedHiveMetaStore, testSynchronized in TestRemoteHiveMetaStore, testSynchronized
in TestSetUGIOnBothClientServer, testSynchronized in TestSetUGIOnOnlyClient, testSynchronized
in TestSetUGIOnOnlyServer, and testNegativeCliDriver_local_mapred_error_cache in TestNegativeCliDriver,
because trunk itself fails these tests on my machine. In addition, trunk produces results in a
different row order for queries skewjoinopt1.q to skewjoinopt5.q,
skewjoinopt10.q, skewjoinopt15.q to skewjoinopt17.q, and skewjoinopt19.q to skewjoinopt20.q,
so I could not test these queries on my machine either. All other tests pass.


Thanks,

Yin Huai

