pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "PigMultiQueryPerformanceSpecification" by RichardDing
Date Mon, 04 May 2009 17:05:50 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The following page has been changed by RichardDing:
http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification

------------------------------------------------------------------------------
  
  Will be executed as:
  
- attachment:mapreduce.png
+ [TBD]
  
  If a split happens in a reduce plan, splittees have to be map-only jobs to be merged into
the splitter.
  If there are map-reduce splittees the reduce will result in a tmp store and the splittees
are run in separate
@@ -570, +570 @@

  The merging of splittees into a splitter consists of:
  
     * Creating a split operator in the map or reduce and setting the splittee plans as nested
plans of the split
-    * If it needs to merge combiners it will introduce a Demux operator to route the input
from mixed split branches in the mapper to the right combine plan. The separate combiner plans
are the nested plans of the Demux operator
+    * If it needs to merge combiners it will introduce a Demux operator to route the input
from mixed split branches in the mapper to the right combine plan. The separate combiner plans
are the nested plans of the Demux operator   
-    * If a map reduce operator does not have a combiner it will insert a FakeLocalRearrange
operator to simply route the input through.
     * If it needs to merge reduce plans, it will do so using the Demux operator the same
way the combiner is merged.
+    * In the cases where some splittees have combiners and some do not have combiners, the
optimizer chooses either the subset of splittees with combiners or the subset of splittees
without combiners--depending on which subset is larger--and merges these splittees into the
splitter.
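
A minimal sketch of the selection rule in the last bullet, written in Python for illustration only (Pig itself is implemented in Java). The names `Splittee` and `choose_mergeable` are hypothetical and do not come from the Pig code base. The page does not say what happens when the two subsets are the same size, so the tie-break used here is an assumption.

```python
from typing import List, NamedTuple


class Splittee(NamedTuple):
    """Hypothetical stand-in for a splittee map-reduce job."""
    name: str
    has_combiner: bool


def choose_mergeable(splittees: List[Splittee]) -> List[Splittee]:
    """Return the larger of the two subsets (with / without combiners)."""
    with_combiner = [s for s in splittees if s.has_combiner]
    without_combiner = [s for s in splittees if not s.has_combiner]
    # Assumption: on a tie this sketch prefers the subset with combiners;
    # the wiki page does not specify the tie-breaking behaviour.
    if len(with_combiner) >= len(without_combiner):
        return with_combiner
    return without_combiner


jobs = [Splittee("a", True), Splittee("b", False), Splittee("c", False)]
merged = choose_mergeable(jobs)
print([s.name for s in merged])
```

In this sketch only the returned subset would be merged into the splitter. What happens to the remaining splittees is not covered by the bullet above.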
  
  Note: As an end result this merging will result in Split or Demux operators with multiple
stores tucked away in their nested plans.
  
@@ -636, +636 @@

  [[Anchor(DemuxOperator)]]
  ===== Demux Operator =====
  
- The demux operator is used in combiners and reducers where the input is a mix of different
split plans of the mapper. It will decide which of its nested plans a record belongs to and
then attach it to that particular plan.
+ The demux operator is used in combiners and reducers where the input is a mix of different
split plans of the mapper. The outputs of the split plans are indexed; based on that index,
the demux operator decides which of its nested plans a record belongs to and attaches the
record to that plan.
+ 
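
The index-based routing described above can be sketched roughly as follows. This is an illustrative Python model, not Pig's Java implementation: the `Demux` class, its `route` method, and the `(index, payload)` record shape are all assumptions made for the example.

```python
from typing import Any, Callable, List, Tuple


class Demux:
    """Hypothetical model of a demux operator holding nested plans."""

    def __init__(self, nested_plans: List[Callable[[Any], Any]]) -> None:
        # Position in this list plays the role of the split-plan index.
        self.nested_plans = nested_plans

    def route(self, record: Tuple[int, Any]) -> Tuple[int, Any]:
        """Attach the payload to the nested plan selected by its index."""
        index, payload = record
        plan = self.nested_plans[index]
        return index, plan(payload)


# Two toy "plans": one adds 1, the other doubles its input.
demux = Demux([lambda x: x + 1, lambda x: x * 2])
print(demux.route((0, 5)))
print(demux.route((1, 5)))
```

Each record carries the index assigned to the split plan that produced it, so a mixed input stream can be dispatched without inspecting the payload itself.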
  
  [[Anchor(Local_Execution_engine)]]
  ==== Local Execution Engine ====
