hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Faraz Ahmad (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-2083) Run partial reduce instead of combiner at reduce node to overlap shuffle delay with reduce
Date Wed, 22 Sep 2010 16:47:35 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Faraz Ahmad updated MAPREDUCE-2083:

    Summary: Run partial reduce instead of combiner at reduce node to overlap shuffle delay
with reduce  (was: Run partial reduce instead of combiner at reduce node)

> Run partial reduce instead of combiner at reduce node to overlap shuffle delay with reduce
> ------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-2083
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2083
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Faraz Ahmad
>             Fix For: 0.20.2
> Shuffle delays can be large for mapreductions with lots of intermediate data. Some of
this shuffle delay can be overlapped with reduce if some of the reduce computation is started
on partial intermediate data received by a reduce. Along these lines, the patch ??HADOOP-3226??
runs the combiner on the reduce side to prune the data that goes to reduce. However, ??HADOOP-3226??
does not achieve our goal of overlap with the shuffle because: 
> (1) In its original use of reducing intermediate data volume, the combiner falls in the
critical path at the map side. Therefore, the combiner is usually a simple function which
is too  lightweight in its new use to achieve sufficient overlap with the shuffle. 
> (2) Running the combiner  at the reduce side is helpful in overlapping with the shuffle
only if  the combiner's functionality is a major portion of the reduce functionality --  otherwise
running the combiner at the reduce side achieves only modest overlap with the shuffle. In
many mapreductions, the combiner computation is often not part or only a small part of reduce
computation. Addressing both these points, reduces that are complex often have heavier-weight
computation than simple combining that can be overlapped with the shuffle. This heavy-weight
computation is specified by a user-supplied "partial reduce" which performs the commutative/associative
parts of reduce. The idea is to run partial reduce on subsets of intermediate data as they
arrive at a reduce to  overlap with the shuffle, and then run the full-blown final reduce
which re-reduces the partially-reduced data. Because the shuffle delay is large  for shuffle-heavy
mapreductions, partial reduce that are heavier-weight than simple combiner can be hidden under
the shuffle delay without extending the critical path of execution. 
> Finally, to further ensure that the partial reduce does not extend the critical path,
we need to include two easily-tunable thresholds: One to start partial reduce only after enough
intermediate data has been received (e.g. use mapred.inmem.merge.threshold or a separately
defined parameter) so that we do not incur the overhead of invoking partial reduce on small
data. Another threshold to stop partial reduce after most of the intermediate data has been
received so that running partial reduce on the small remainder data does not delay starting
final reduce.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message