hadoop-common-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Help: How to increase amount of map tasks per job ?
Date Fri, 07 Jan 2011 21:47:26 GMT
Check out mapred.map.tasks and mapred.reduce.tasks
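
Note that for file-based input, mapred.map.tasks is only a hint to the framework, whereas mapred.reduce.tasks is used as given. The sketch below is a rough illustration of why the hint may or may not change the number of map tasks. It is modeled on the split-size rule of the old-API FileInputFormat: split size = max(min size, min(goal size, block size)), where goal size = total input bytes / requested maps. The class and method names are hypothetical, the files are assumed to be splittable and non-empty, and the file sizes are made up for illustration, not taken from the cluster discussed in this thread.

```java
// Hypothetical helper, not part of Hadoop: estimates how many input splits
// (and therefore map tasks) a file-based job would get.
final class SplitEstimate {
    // The last chunk of a file may be up to 10% larger than the split size.
    private static final double SPLIT_SLOP = 1.1;

    // Split size rule: never below minSize, never above the block size
    // unless minSize forces it.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // fileSizes:     size of each input file in bytes (assumed splittable, > 0)
    // requestedMaps: the value of mapred.map.tasks (a hint only)
    // minSize:       minimum split size in bytes
    // blockSize:     HDFS block size in bytes
    static int estimateSplits(long[] fileSizes, int requestedMaps,
                              long minSize, long blockSize) {
        long totalSize = 0;
        for (long size : fileSizes) {
            totalSize += size;
        }
        long goalSize = totalSize / Math.max(1, requestedMaps);
        long splitSize = computeSplitSize(goalSize, Math.max(1, minSize), blockSize);

        int splits = 0;
        for (long size : fileSizes) {
            long remaining = size;
            // Carve full-size splits while clearly more than one split is left.
            while ((double) remaining / splitSize > SPLIT_SLOP) {
                splits++;
                remaining -= splitSize;
            }
            // Whatever is left becomes the final split of this file.
            if (remaining > 0) {
                splits++;
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] smallFiles = {32 * mb, 32 * mb, 32 * mb, 32 * mb};
        // Low hint: goal size is capped by the block size, one split per file.
        System.out.println(estimateSplits(smallFiles, 2, 1, 64 * mb));
        // Higher hint: goal size shrinks, so each file is cut into more splits.
        System.out.println(estimateSplits(smallFiles, 82, 1, 64 * mb));
    }
}
```

Under these assumptions, a job whose input is a handful of files smaller than one block gets one map task per file unless the hint is raised enough to push the goal size below the file size. Input formats that do not split files, or a large minimum split size, would make the hint ineffective.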

On Fri, Jan 7, 2011 at 1:40 PM, Tali K <ncherryus@hotmail.com> wrote:

>
> According to the documentation, that parameter is for the number of
> tasks *per TaskTracker*. I am asking about the number of tasks
> for the entire job and the entire cluster. That parameter is already
> set to 3, which is one less than the number of cores on each node's
> CPU, as recommended. In my question I stated that
> 82 tasks were run for the first job, yet only 4 for the second,
> both numbers being cluster-wide.
>
>
>
> > Date: Fri, 7 Jan 2011 13:19:42 -0800
> > Subject: Re: Help: How to increase amount of map tasks per job ?
> > From: yuzhihong@gmail.com
> > To: common-user@hadoop.apache.org
> >
> > Set higher values for mapred.tasktracker.map.tasks.maximum (and
> > mapred.tasktracker.reduce.tasks.maximum) in mapred-site.xml
> >
> > On Fri, Jan 7, 2011 at 12:58 PM, Tali K <ncherryus@hotmail.com> wrote:
> >
> > >
> > > We have a job which runs in several map/reduce stages. In the first
> > > job, a large number of map tasks (82) are initiated, as expected,
> > > and that causes all nodes to be used.
> > > In a later job, where we are still dealing with large amounts of
> > > data, only 4 map tasks are initiated, so only 4 nodes are used.
> > > This stage is actually the workhorse of the job, and requires much
> > > more processing power than the initial stage.
> > > We are trying to understand why only a few map tasks are
> > > being used, as we are not getting the full advantage of our cluster.
> > >
>
>
