pig-dev mailing list archives

From "Cheolsoo Park" <piaozhe...@gmail.com>
Subject Re: Review Request 23787: Group All followed by CROSS with default parallelism produces wrong results
Date Fri, 25 Jul 2014 18:36:23 GMT


> On July 25, 2014, 6:28 p.m., Rohini Palaniswamy wrote:
> > trunk/src/org/apache/pig/PigConfiguration.java, line 274
> > <https://reviews.apache.org/r/23787/diff/4/?file=641632#file641632line274>
> >
> >     It is an internal setting, not a user-facing one. We should probably create a new class called PigInternalConfiguration for those.
> >     
> >     Can also remove "hint" from the name as it is used as is.

We have PigConstants for internal configurations. We can use that, no?


- Cheolsoo


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23787/#review48747
-----------------------------------------------------------


On July 25, 2014, 1:36 a.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23787/
> -----------------------------------------------------------
> 
> (Updated July 25, 2014, 1:36 a.m.)
> 
> 
> Review request for pig.
> 
> 
> Bugs: PIG-4057
>     https://issues.apache.org/jira/browse/PIG-4057
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> Summary of changes:
> 1. Move Tez parallelism estimation out of TezDagBuilder into ParallelismSetter, so we can get the estimated parallelism of the cross before creating the GFCross vertex
> 2. Move InputSplit generation out of TezDagBuilder into LoaderProcessor, since we need to know the parallelism of the maps before ParallelismSetter runs
> 3. Set pig.cross.parallelism.hint.(operator_key) in conf
>     * In Tez, this is done when we encounter the cross vertex
>     * In MR, this is done when we encounter the first GFCross
> 4. GFCross will use pig.cross.parallelism.hint.(operator_key) to determine the number of partitions
> 
> 
> Diffs
> -----
> 
>   trunk/src/org/apache/pig/PigConfiguration.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POGlobalRearrange.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/LoaderProcessor.java PRE-CREATION
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/ParallelismSetter.java PRE-CREATION
>   trunk/src/org/apache/pig/impl/builtin/GFCross.java 1613328
>   trunk/src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 1613328
>   trunk/test/e2e/pig/tests/nightly.conf 1613328
>   trunk/test/org/apache/pig/test/TestGFCross.java 1613328
> 
> Diff: https://reviews.apache.org/r/23787/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>
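
Items 3 and 4 of the quoted description (store a per-operator setting under pig.cross.parallelism.hint.(operator_key), then read it back in GFCross) can be illustrated with a minimal sketch. This is not the code from the patch: it uses java.util.Properties in place of the job configuration, and the class name, method names, fallback value, and the "scope-42" operator key are all hypothetical. Only the key prefix comes from the description.

```java
import java.util.Properties;

public class CrossParallelismHintSketch {
    // Key prefix taken from the review description; everything else is hypothetical.
    static final String PREFIX = "pig.cross.parallelism.hint.";

    // Hypothetical fallback used when no value was stored for an operator.
    static final int FALLBACK_PARALLELISM = 1;

    // Builds the per-operator key, e.g. pig.cross.parallelism.hint.scope-42.
    static String keyFor(String operatorKey) {
        return PREFIX + operatorKey;
    }

    // Planner side: record the estimated parallelism for one cross operator.
    static void storeHint(Properties conf, String operatorKey, int parallelism) {
        conf.setProperty(keyFor(operatorKey), Integer.toString(parallelism));
    }

    // Consumer side: look up the stored value, falling back when it is absent.
    static int readHint(Properties conf, String operatorKey) {
        String value = conf.getProperty(keyFor(operatorKey));
        return value == null ? FALLBACK_PARALLELISM : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        storeHint(conf, "scope-42", 10);
        System.out.println(readHint(conf, "scope-42"));
        System.out.println(readHint(conf, "scope-99"));
    }
}
```

Keying the setting by operator lets two cross operators in the same plan carry different parallelism values without overwriting each other, which a single global key could not do.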

