pig-dev mailing list archives

From "Cheolsoo Park" <piaozhe...@gmail.com>
Subject Re: Review Request 15194: Support multiple inputs for PigProcessor
Date Sun, 03 Nov 2013 17:41:38 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15194/#review28079
-----------------------------------------------------------



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/POSimpleTezLoad.java
<https://reviews.apache.org/r/15194/#comment54640>

    This breaks compilation with Hadoop 1.x.
    
    i.e. -Dhadoopversion=20
    
    We need to move it under tez package.



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java
<https://reviews.apache.org/r/15194/#comment54641>

    This reverts PIG-3060 and breaks TestEvalPipelineLocal.testFlattenEmptyBag.
    
    https://issues.apache.org/jira/browse/PIG-3060


- Cheolsoo Park


On Nov. 2, 2013, 1:17 a.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15194/
> -----------------------------------------------------------
> 
> (Updated Nov. 2, 2013, 1:17 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park, Daniel Dai, and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3527
>     https://issues.apache.org/jira/browse/PIG-3527
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Adds support for multiple LogicalInputs to the PigProcessor. This is done by adding a
> new TezLoad interface which PhysicalOperators may implement. On the backend, any operators
> implementing this interface will have the LogicalInput attached to them. 2 implementations
> are included:
> * POSimpleTezLoad which consumes a single MRInput
> * POShuffleTezLoad which consumes one or more ShuffledMergedInputs.
> The POShuffleTezLoad does a k-way merge of the shuffle inputs to package for the operator
> pipeline. This required a change to the comparators used so that the sort order remained consistent.
> There is also a fix to POForEach where it was using the incorrect status code for signaling
> (although it produced the same end result in the MR pipeline).
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigDecimalRawComparator.java ddea99e
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigIntegerRawComparator.java 5ea3fc7
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBooleanRawComparator.java dfd4ebf
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBytesRawComparator.java 09397e5
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigDateTimeRawComparator.java a87161f
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigDoubleRawComparator.java cbf457f
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigFloatRawComparator.java 1d86e3f
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigIntRawComparator.java bb6c9df
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigLongRawComparator.java b3ded76
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigSecondaryKeyComparator.java 5ad334b
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTextRawComparator.java 022f37b
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleDefaultRawComparator.java 866c39d
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleSortComparator.java 9724b9f
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/POSimpleTezLoad.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/TezLoad.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java eb9f62a
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
>   src/org/apache/pig/backend/hadoop/executionengine/tez/FileInputHandler.java d29e330
>   src/org/apache/pig/backend/hadoop/executionengine/tez/InputHandler.java d2298ca
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java ebb3145
>   src/org/apache/pig/backend/hadoop/executionengine/tez/ShuffledInputHandler.java d7b42b8
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 45e47b0
>   src/org/apache/pig/data/BinInterSedes.java b3ec51e
>   src/org/apache/pig/data/DefaultTuple.java 2e7ca5f
>   test/e2e/pig/tests/tez.conf 24af8d3
> 
> Diff: https://reviews.apache.org/r/15194/diff/
> 
> 
> Testing
> -------
> 
> Manual testing was done and an e2e test has been added. Because of the comparator change,
> some of the tests fail because of bag ordering.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>
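
For readers unfamiliar with the k-way merge mentioned in the quoted description, here is a minimal, generic sketch of the technique. It is not the code from the patch: the class name KWayMergeSketch and its methods are hypothetical, and it merges plain in-memory lists of Comparable values rather than Tez ShuffledMergedInputs or Pig's raw comparators. The point it illustrates is that every input must already be sorted under the same ordering as the one the merge uses, which is presumably why the comparators had to stay consistent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not part of Pig or Tez.
public class KWayMergeSketch {

    // One heap entry: the current head value of an input and the iterator it came from.
    private static final class Head<T extends Comparable<T>> implements Comparable<Head<T>> {
        final T value;
        final Iterator<T> source;

        Head(T value, Iterator<T> source) {
            this.value = value;
            this.source = source;
        }

        @Override
        public int compareTo(Head<T> other) {
            return value.compareTo(other.value);
        }
    }

    // Merges k individually sorted inputs into one sorted list using a min-heap of heads.
    public static <T extends Comparable<T>> List<T> merge(List<List<T>> sortedInputs) {
        PriorityQueue<Head<T>> heap = new PriorityQueue<>();
        for (List<T> input : sortedInputs) {
            Iterator<T> it = input.iterator();
            if (it.hasNext()) {
                heap.add(new Head<>(it.next(), it));
            }
        }

        List<T> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Emit the smallest head, then refill the heap from the same input.
            Head<T> smallest = heap.poll();
            merged.add(smallest.value);
            if (smallest.source.hasNext()) {
                heap.add(new Head<>(smallest.source.next(), smallest.source));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> inputs = Arrays.asList(
                Arrays.asList(1, 4, 9),
                Arrays.asList(2, 3, 10),
                Arrays.asList(5));
        System.out.println(merge(inputs)); // prints [1, 2, 3, 4, 5, 9, 10]
    }
}
```

If one input were sorted under a different ordering than the heap uses, the merged output would no longer be sorted, and equal keys from different inputs would not necessarily come out next to each other.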

