hadoop-common-user mailing list archives

From Abhinay Mehta <abhinay.me...@gmail.com>
Subject Re: what affects number of reducers launched by hadoop?
Date Thu, 29 Jul 2010 10:31:31 GMT
Which configuration key controls "the number of maximum tasks per node"?


On 28 July 2010 20:40, Joe Stein <charmalloc@allthingshadoop.com> wrote:

> mapred.tasktracker.reduce.tasks.maximum is the ceiling on how many reduce
> tasks can run concurrently on each node
>
> you need to configure *mapred.reduce.tasks* to be more than one, as it
> defaults to 1 (you are overriding that in your code, which is why it works
> there)
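>
> For example, here is a rough sketch of the programmatic equivalent
> (new-API style, matching the setNumReduceTasks call below; the job name
> is just a placeholder):
>
>   // imports: org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapreduce.Job
>   Configuration conf = new Configuration();
>   conf.setInt("mapred.reduce.tasks", 10); // same effect as job.setNumReduceTasks(10)
>   Job job = new Job(conf, "my job");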
>
> This value should be somewhere between 0.95 and 1.75 times the maximum
> number of tasks per node times the number of data nodes.
>
> So if you have 3 data nodes and each is set up with a max of 7 reduce
> tasks (21 slots in total), configure this somewhere between 20 and 37
> (0.95 * 21 = 19.95 and 1.75 * 21 = 36.75)
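>
> In code form, that heuristic is just (a sketch, with the numbers from the
> example above hard-coded):
>
>   int nodes = 3, maxReducePerNode = 7;
>   int slots = nodes * maxReducePerNode;   // 21 reduce slots in total
>   long lower = Math.round(0.95 * slots);  // 20
>   long upper = Math.round(1.75 * slots);  // 37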
>
> On Wed, Jul 28, 2010 at 3:24 PM, Vitaliy Semochkin
> <vitaliy.se@gmail.com> wrote:
>
> > Hi,
> >
> > in my cluster mapred.tasktracker.reduce.tasks.maximum = 4,
> > however, while monitoring the job in the JobTracker I see only one
> > reducer working
> >
> > first the status is
> > reduce > copy - can someone please explain what this means?
> >
> > and after that it is
> > reduce > reduce
> >
> > when I set the number of reduce tasks for the job programmatically to 10,
> > job.setNumReduceTasks(10);
> > the number of "reduce > reduce" reducers increases to 10 and the
> > performance of the application improves as well (the number of reducers
> > never exceeds that value).
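> >
> > For reference, the surrounding driver code is roughly (a sketch; the
> > job name is a placeholder):
> >
> >   Job job = new Job(new Configuration(), "my job");
> >   job.setNumReduceTasks(10); // without this, mapred.reduce.tasks defaults to 1
> >   job.waitForCompletion(true);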
> >
> > Can someone explain such behavior?
> >
> > Thanks in Advance,
> > Vitaliy S
> >
>
>
>
> --
>
> /*
> Joe Stein
> http://www.linkedin.com/in/charmalloc
> Twitter: @allthingshadoop
> */
>
