hadoop-common-user mailing list archives

From Tali K <ncherr...@hotmail.com>
Subject RE: Help: How to increase amont maptasks per job ?
Date Fri, 07 Jan 2011 21:40:38 GMT

According to the documentation, that parameter sets the number of
tasks *per TaskTracker*.  I am asking about the number of tasks
for the entire job across the entire cluster.  That parameter is already
set to 3, which is one less than the number of cores on each node's
CPU, as recommended.  In my question I stated that
82 map tasks were run for the first job, yet only 4 for the second;
both numbers are cluster-wide.
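
To make the distinction concrete, here is a rough sketch of the arithmetic. It is not Hadoop code: the class name SlotMath and the node count of 10 are assumptions for illustration only (the cluster size is not stated in this thread). It only shows that the per-TaskTracker maximum caps how many tasks can run at once, and does not change how many map tasks a job has.

```java
// Illustrative sketch only: SlotMath and the node count below are hypothetical,
// not part of Hadoop or of the cluster described in this thread.
class SlotMath {
    // Upper bound on map tasks running at once: limited both by how many
    // map tasks the job has and by the total map slots in the cluster.
    static int maxConcurrentMaps(int jobMapTasks, int nodes, int slotsPerTaskTracker) {
        int clusterSlots = nodes * slotsPerTaskTracker;
        return Math.min(jobMapTasks, clusterSlots);
    }

    public static void main(String[] args) {
        int nodes = 10; // assumed; the thread does not give the cluster size
        int slots = 3;  // mapred.tasktracker.map.tasks.maximum, as stated above

        System.out.println(maxConcurrentMaps(82, nodes, slots));     // first job: 30
        System.out.println(maxConcurrentMaps(4, nodes, slots));      // second job: 4
        System.out.println(maxConcurrentMaps(4, nodes, slots * 2));  // higher cap: still 4
    }
}
```

Under these assumptions, doubling the per-TaskTracker maximum leaves the second job at 4 concurrent map tasks, because the job itself only has 4.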

    

> Date: Fri, 7 Jan 2011 13:19:42 -0800
> Subject: Re: Help: How to increase amont maptasks per job ?
> From: yuzhihong@gmail.com
> To: common-user@hadoop.apache.org
> 
> Set higher values for mapred.tasktracker.map.tasks.maximum (and
> mapred.tasktracker.reduce.tasks.maximum) in mapred-site.xml
> 
> On Fri, Jan 7, 2011 at 12:58 PM, Tali K <ncherryus@hotmail.com> wrote:
> 
> >
> >
> >
> >
> > We have a job that runs in several map/reduce stages.  In the first job,
> > a large number of map tasks (82) are initiated, as expected,
> > and that causes all nodes to be used.
> > In a later job, where we are still dealing with large amounts of
> > data, only 4 map tasks are initiated, so only 4 nodes are used.
> > This stage is actually the
> > workhorse of the job, and requires much more processing power than the
> > initial stage.
> > We are trying to understand why only a few map tasks are
> > being used, as we are not getting the full advantage of our cluster.
> >
> >
> >
> >
 		 	   		  