hadoop-common-user mailing list archives

From Joe Stein <charmal...@allthingshadoop.com>
Subject Re: what affects number of reducers launched by hadoop?
Date Wed, 28 Jul 2010 19:40:31 GMT
mapred.tasktracker.reduce.tasks.maximum is only a ceiling: it caps how many
reduce tasks can run at the same time on each node

to get more than one reducer you need to set *mapred.reduce.tasks* to a value
greater than one, since it defaults to 1 (you are overriding it in your code
with job.setNumReduceTasks(10), which is why it works there)

This value should be somewhere between 0.95 and 1.75 times the total number
of reduce slots, i.e. the maximum tasks per node times the number of data nodes.

So if you have 3 data nodes and each is set up with a max of 7 tasks, that is
21 slots, so configure this between about 20 and 36
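
The arithmetic behind that rule of thumb can be sketched in a few lines of Java. This is only an illustration: the class and method names are made up for this example and are not part of Hadoop, and the 0.95 and 1.75 factors are simply the ones quoted above. It lists the whole numbers that fall inside the range; for 3 nodes with 7 reduce slots each the raw bounds are 19.95 and 36.75.

```java
public class ReducerRange {
    // Factors from the rule of thumb quoted in this message.
    static final double LOW_FACTOR = 0.95;
    static final double HIGH_FACTOR = 1.75;

    // Total reduce slots in the cluster: nodes times the per-node ceiling
    // (the value of mapred.tasktracker.reduce.tasks.maximum).
    static int totalReduceSlots(int nodes, int maxReduceTasksPerNode) {
        return nodes * maxReduceTasksPerNode;
    }

    // Smallest whole number of reducers that is at least 0.95 * slots.
    static int smallestInRange(int slots) {
        return (int) Math.ceil(LOW_FACTOR * slots);
    }

    // Largest whole number of reducers that is at most 1.75 * slots.
    static int largestInRange(int slots) {
        return (int) Math.floor(HIGH_FACTOR * slots);
    }

    public static void main(String[] args) {
        int slots = totalReduceSlots(3, 7);
        System.out.println("reduce slots: " + slots);
        System.out.println("mapred.reduce.tasks between "
                + smallestInRange(slots) + " and " + largestInRange(slots));
    }
}
```

Whatever number you pick from that range is what you would put in mapred.reduce.tasks, or pass to job.setNumReduceTasks(...) as you already do in your code.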

On Wed, Jul 28, 2010 at 3:24 PM, Vitaliy Semochkin <vitaliy.se@gmail.com> wrote:

> Hi,
>
> in my cluster mapred.tasktracker.reduce.tasks.maximum = 4
> however during monitoring the job in job tracker I see only 1 reducer
> working
>
> first it is
> reduce > copy - can someone please explain what does this mean?
>
> after it is
> reduce > reduce
>
> when I set the number of reduce tasks for a job programmatically to 10
> job.setNumReduceTasks(10);
> the number of "reduce > reduce" reducers increases to 10 and the
> performance of application increases as well (the number of reducers
> never exceeds).
>
> Can someone explain such behavior?
>
> Thanks in Advance,
> Vitaliy S
>



-- 

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop
*/
