pig-dev mailing list archives

From "Rohini Palaniswamy" <rohini.adi...@gmail.com>
Subject Review Request 31798: PIG-4443: Write inputsplits in Tez to disk if the size is huge and option to compress pig input splits
Date Fri, 06 Mar 2015 15:51:14 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31798/
-----------------------------------------------------------

Review request for pig and Daniel Dai.


Bugs: PIG-4443
    https://issues.apache.org/jira/browse/PIG-4443


Repository: pig


Description
-------

This patch adds two settings:


1) pig.compress.input.splits
    This compresses the Pig input split information if it is not a FileSplit.
    Compressing FileSplit did not give much benefit. This can be turned on for
    HCatLoader till HIVE-9845 and TEZ-2144 are fixed. If TEZ-1244 is fixed, we
    can always turn this off for Tez, as compressing the whole payload will
    compress far better than compressing individual splits.
2) pig.tez.input.splits.mem.threshold
    Write input splits to disk in Tez if this threshold is hit. The default is
    32MB, which is half of the default 64MB protobuf transfer limit.
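
As a rough illustration of how the two settings interact (this is not the code in the patch), the sketch below compresses a serialized split only when it is not a FileSplit and the compression flag is on, then checks the accumulated size against the threshold. The class and method names are hypothetical, Deflate is only a stand-in for whatever codec the patch uses, and treating "threshold is hit" as greater-than-or-equal is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;

// Hypothetical sketch; names and codec are assumptions, not Pig's actual API.
final class SplitPayloadSketch {
    // 32MB, half of the default 64MB protobuf transfer limit (from the description).
    static final long DEFAULT_MEM_THRESHOLD = 32L * 1024 * 1024;

    // Deflate the serialized split bytes (stand-in for the real compression).
    static byte[] compress(byte[] serializedSplit) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(out)) {
            deflater.write(serializedSplit);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Mirrors pig.compress.input.splits: FileSplits are left uncompressed.
    static byte[] maybeCompress(byte[] serializedSplit, boolean isFileSplit,
                                boolean compressEnabled) {
        if (!compressEnabled || isFileSplit) {
            return serializedSplit;
        }
        return compress(serializedSplit);
    }

    // Mirrors pig.tez.input.splits.mem.threshold: spill once the total reaches it.
    static boolean shouldWriteToDisk(long totalSplitBytes, long threshold) {
        return totalSplitBytes >= threshold;
    }
}
```

With these helpers, a caller would sum the lengths returned by maybeCompress over all splits and write them to disk instead of the payload when shouldWriteToDisk returns true.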
    
This patch also removes MRJobConfig.MAPREDUCE_JOB_CREDENTIALS_BINARY from the
Tez payload. If it is left in, any API that calls
TokenCache.obtainTokensForNamenodes on the task fails when Pig is run via
Oozie, because the value is set to the credential file path in the Oozie
launcher job, and that file is not available on the tasks. This issue was hit
by Hive running with Oozie.
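
The effect of that change can be sketched as follows. A plain Map stands in for the Hadoop Configuration so the example has no dependencies, the class and method names are hypothetical, and the key string is assumed to be the value of MRJobConfig.MAPREDUCE_JOB_CREDENTIALS_BINARY.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a Map stands in for org.apache.hadoop.conf.Configuration.
final class TezPayloadConfSketch {
    // Assumed value of MRJobConfig.MAPREDUCE_JOB_CREDENTIALS_BINARY.
    static final String CREDENTIALS_BINARY = "mapreduce.job.credentials.binary";

    // Copy the launcher-side settings and drop the credentials file path,
    // since that local path does not exist on the task nodes.
    static Map<String, String> forTezPayload(Map<String, String> launcherConf) {
        Map<String, String> payload = new HashMap<>(launcherConf);
        payload.remove(CREDENTIALS_BINARY);
        return payload;
    }
}
```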


Diffs
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1664412
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigSplit.java 1664412
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1664412
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1664412
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/MRToTezHelper.java 1664412
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java 1664412

Diff: https://reviews.apache.org/r/31798/diff/


Testing
-------

Yes


Thanks,

Rohini Palaniswamy

