pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rohini Palaniswamy" <rohini.adi...@gmail.com>
Subject Re: Review Request 32868: PIG-4495: Better multi-query planning in case of multiple edges
Date Mon, 06 Apr 2015 21:01:56 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32868/
-----------------------------------------------------------

(Updated April 6, 2015, 9:01 p.m.)


Review request for pig and Daniel Dai.


Changes
-------

Updated with tests and fixed issues found with new tests.


Bugs: PIG-4495
    https://issues.apache.org/jira/browse/PIG-4495


Repository: pig


Description (updated)
-------

This patch basically gets rid of the need for the ask TEZ-1190 Allow multiple edges between
two vertexes. 

Changes done:
   1) Case of Self join/cross/cogroup
        - Multiple sub-plans of split write to the same output. The POShuffleTezLoad is now
capable of splitting the input into correct bags based on the index in the key.
        - Do not allow cases like self-replicate/self-skewed join
   2) Case of union
        - Multiple sub-plans of split write to the same output and connect to the vertex group.
If only sub-plans of the split are members of the union, then no vertex group is created and
split is directly connected to union successors. 
        - For cases like nightly.conf Union_16.pig (moved to multiquery.conf now) which has
multiple levels of union all from same split, even the vertex group created is removed and
all the split sub-plans write directly to the successor.
   3) Other optimizations done
        - If there was a union followed by replicate join it was not optimized (PIG-3856).
But if the union is within the same split we now broadcast the replicate join once to the
split operator.
   4) Refactored code in UnionOptimizer into methods for easy readability.
   5) Not very related, but cleaned up TestMultiQueryLocal as had to search the logs for exception
logged while testing this patch instead of being able to look at the junit test failure stacktrace.


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPoissonSample.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezOperator.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPOPackageAnnotator.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POFRJoinTez.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POPartitionRearrangeTez.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POShuffleTezLoad.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POStoreTez.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POValueOutputTez.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/MultiQueryOptimizerTez.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/TezOperDependencyParallelismEstimator.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/UnionOptimizer.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/tests/multiquery.conf 1671263 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/tests/nightly.conf 1671263 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestMultiQueryLocal.java
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Cross-1.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Cross-2.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2-OPTOFF.gld
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2.gld
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SelfJoin-1.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SelfJoin-2.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SelfJoin-3.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-1.gld
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-11-OPTOFF.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-11.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-12-OPTOFF.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-12.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-13-OPTOFF.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-13.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-14-OPTOFF.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-14.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-15-OPTOFF.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-15.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16-OPTOFF.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16.gld
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-9-OPTOFF.gld
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-9.gld
1671263 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1671263


Diff: https://reviews.apache.org/r/32868/diff/


Testing (updated)
-------

New tests added. Running full test suite now


Thanks,

Rohini Palaniswamy


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message