hadoop-common-user mailing list archives

From Allen Wittenauer <awittena...@linkedin.com>
Subject Re: When does Reduce job start
Date Tue, 04 Jan 2011 20:16:35 GMT

On Jan 4, 2011, at 10:53 AM, sagar naik wrote:
> 
> The only reason I can think of for not starting a reduce task is to
> avoid the unnecessary transfer of map output data in case of
> failures.

	Reduce tasks also occupy slots while they wait for and copy the map output. On shared grids,
this can be extremely bad behavior.

> Is there a way to quickly start the reduce task in such a case?
> What is the configuration param to change this behavior?

mapred.reduce.slowstart.completed.maps

See http://wiki.apache.org/hadoop/LimitingTaskSlotUsage (from the FAQ 2.12/2.13 questions).
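
To make the effect of that parameter concrete, here is a rough sketch of the arithmetic, not the actual JobTracker code. The value is a fraction between 0.0 and 1.0 of the job's map tasks that must finish before reduces are scheduled (the stock default in releases of this era is, as far as I know, 0.05). The helper names below are hypothetical, and rounding the threshold up is an assumption made for illustration.

```python
import math

# Property name as given in the reply above.
SLOWSTART_KEY = "mapred.reduce.slowstart.completed.maps"


def maps_needed_before_reduces(total_maps, slowstart_fraction):
    """Hypothetical helper: how many maps must finish before reduces may be scheduled.

    Assumption: the threshold is the fraction of total maps, rounded up.
    """
    if not 0.0 <= slowstart_fraction <= 1.0:
        raise ValueError(SLOWSTART_KEY + " must be between 0.0 and 1.0")
    return math.ceil(slowstart_fraction * total_maps)


def reduces_may_start(finished_maps, total_maps, slowstart_fraction):
    """Hypothetical helper: True once enough maps have completed."""
    return finished_maps >= maps_needed_before_reduces(total_maps, slowstart_fraction)


if __name__ == "__main__":
    # A low fraction starts reduces early (they hold slots while copying);
    # 1.0 holds them back until every map has finished.
    for fraction in (0.0, 0.25, 0.5, 1.0):
        print(fraction, maps_needed_before_reduces(200, fraction))
```

On a shared grid, pushing the value toward 1.0 keeps reduce slots free for other jobs until the maps are nearly done; a low value starts the copy phase sooner at the cost of holding those slots longer.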

