hadoop-mapreduce-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: When a Reduce Task starts?
Date Tue, 21 Dec 2010 06:43:15 GMT
On Tue, Dec 21, 2010 at 7:23 AM, li ping <li.j2ee@gmail.com> wrote:
> I think reduces can be started before all of the maps have finished.
> See the configuration item in mapred-site.xml:
> <property>
>   <name>mapred.reduce.slowstart.completed.maps</name>
>   <value>0.05</value>
>   <description>Fraction of the number of maps in the job which should be
>   complete before reduces are scheduled for the job.
>   </description>
> </property>
> Correct me if I'm wrong.

Well, it depends on what you mean by a "reduce". A ReduceTask, in
Hadoop terms, may begin once some maps have completed (as configured by
mapred.reduce.slowstart.completed.maps), but it would only be in the
copy phase, not the sort or reduce phase.

With the current Hadoop implementation, a reduce(Key, Iterable<Value>)
will never be called until all mappers have completed.
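
To make the distinction concrete, here is a rough sketch in plain Java.
It is a toy model of the behaviour described above, not Hadoop's actual
scheduler code: the class, enum, and method names are made up for
illustration, and the exact threshold comparison is an assumption. It
also glosses over the fact that a ReduceTask still has to finish
copying the last map outputs before it sorts and reduces.

```java
// Toy model only; none of these names exist in Hadoop.
class ReduceStartSketch {

    // Hypothetical labels for what a ReduceTask can be doing.
    enum Phase { NOT_SCHEDULED, COPY, SORT_AND_REDUCE }

    // slowstart stands in for mapred.reduce.slowstart.completed.maps.
    static Phase phase(int totalMaps, int completedMaps, double slowstart) {
        if (completedMaps >= totalMaps) {
            // All map outputs exist, so reduce(Key, Iterable<Value>) can be called.
            return Phase.SORT_AND_REDUCE;
        }
        if (completedMaps >= slowstart * totalMaps) {
            // The task may be scheduled, but it can only copy finished map outputs.
            return Phase.COPY;
        }
        // Too few maps have completed for reduces to be scheduled at all.
        return Phase.NOT_SCHEDULED;
    }

    public static void main(String[] args) {
        int totalMaps = 100;
        for (int done : new int[] {2, 10, 99, 100}) {
            System.out.println(done + "/" + totalMaps + " maps done: "
                    + phase(totalMaps, done, 0.05));
        }
    }
}
```

Note that even with 99 of 100 maps done the model stays in COPY; only
the last map completing lets the reduce function itself run.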

Harsh J
