pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rohini Palaniswamy <rohini.adi...@gmail.com>
Subject Re: Review Request 64844: PIG-5323: Implement LastInputStreamingOptimizer in Tez
Date Wed, 27 Dec 2017 02:40:23 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64844/
-----------------------------------------------------------

(Updated Dec. 27, 2017, 2:40 a.m.)


Review request for pig.


Repository: pig


Description
-------

http://pig.apache.org/docs/r0.17.0/perf.html#join-optimizations
>Optimization for regular joins ensures that the last table in the join is not brought
into memory but streamed through instead. Optimization reduces the amount of memory used which
means you can avoid spilling the data and also should be able to scale your query to larger
data volumes.
To take advantage of this optimization, make sure that the table with the largest number of
tuples per key is the last table in your query. In some of our tests we saw 10x performance
improvement as the result of this optimization.

We are not doing that in Tez and both the tables are materialized as InternalCachedBag.


Diffs
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POOptimizedForEach.java
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POShuffleTezLoad.java
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/LastInputStreamingOptimizerTez.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/EmptyDataBag.java PRE-CREATION

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/ReadOnceBag.java 1819291

  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-1-KeyToReducer.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-1.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-3-KeyToReducer.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-3.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-4-KeyToReducer.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-4.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-5-KeyToReducer.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-5.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-6-KeyToReducer.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-6.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-7-KeyToReducer.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-BloomJoin-7.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Cross-1.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Cross-2.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Cross-3.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Join-1.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-3.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-5.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-6.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-7.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2-OPTOFF.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-4-OPTOFF.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-4.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SelfJoin-1.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SelfJoin-2.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SelfJoin-5.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SelfJoin-6.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SkewJoin-1.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SkewJoin-2.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-12-OPTOFF.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-12.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-15-OPTOFF.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-15.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16-OPTOFF.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-22.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-3-OPTOFF.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-3.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-6-OPTOFF.gld
1819291 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-6.gld
1819291 


Diff: https://reviews.apache.org/r/64844/diff/1/


Testing (updated)
-------

All unit and e2e tests pass


Thanks,

Rohini Palaniswamy


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message