hadoop-common-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: Help: How to increase amont maptasks per job ?
Date Sat, 08 Jan 2011 04:12:43 GMT
It would depend on your input format. If the job is using an
InputFormat that does not let it split files, you would get only as
many mappers as there are files (mappers == no. of files). For
splittable input files, you can get more mappers than files. A little
more information on what the input format is would help track down
the problem.
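As a rough illustration (this is not Hadoop's actual code, and the file sizes and 64 MB split size below are made-up example inputs), the effect of splittability on mapper count boils down to something like:

```python
import math

def count_map_tasks(file_sizes, split_size, splittable):
    """Sketch of how an InputFormat's split count behaves.

    splittable=False (e.g. gzip-compressed input): one mapper per file,
    regardless of file size.
    splittable=True (e.g. plain text): roughly one mapper per
    split_size-sized chunk of each file.
    """
    if not splittable:
        return len(file_sizes)
    return sum(max(1, math.ceil(size / split_size)) for size in file_sizes)

# 4 unsplittable files -> only 4 mappers, however large the files are
print(count_map_tasks([5_000_000_000] * 4, 64 * 1024**2, splittable=False))  # 4

# The same data in splittable form -> 300 mappers (75 splits per file)
print(count_map_tasks([5_000_000_000] * 4, 64 * 1024**2, splittable=True))   # 300
```

This is one common way a job ends up with exactly as many map tasks as input files: an intermediate stage writes a handful of files in an unsplittable format.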

On Sat, Jan 8, 2011 at 3:10 AM, Tali K <ncherryus@hotmail.com> wrote:
>
> According to the documentation, that parameter is for the number of
>    tasks *per TaskTracker*.  I am asking about the number of tasks
>    for the entire job and the entire cluster.  That parameter is already
>    set to 3, which is one less than the number of cores on each node's
>    CPU, as recommended.  In my question I stated that
>    82 tasks were run for the first job, yet only 4 for the second -
>    both numbers being cluster-wide.
>
>
>
>> Date: Fri, 7 Jan 2011 13:19:42 -0800
>> Subject: Re: Help: How to increase amont maptasks per job ?
>> From: yuzhihong@gmail.com
>> To: common-user@hadoop.apache.org
>>
>> Set higher values for mapred.tasktracker.map.tasks.maximum (and
>> mapred.tasktracker.reduce.tasks.maximum) in mapred-site.xml
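[For reference, a mapred-site.xml fragment setting these properties might look like the following; the value 3 is only an example, and as noted in the reply above, these settings cap concurrent task slots per TaskTracker, not tasks per job:]

```xml
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>3</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>3</value>
  </property>
</configuration>
```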
>>
>> On Fri, Jan 7, 2011 at 12:58 PM, Tali K <ncherryus@hotmail.com> wrote:
>>
>> >
>> > We have a job which runs in several map/reduce stages.  In the first
>> > stage, a large number of map tasks (82) are initiated, as expected,
>> > and that causes all nodes to be used.
>> > In a later stage, where we are still dealing with large amounts of
>> > data, only 4 map tasks are initiated, so only 4 nodes are used.
>> > This later stage is actually the
>> > workhorse of the job, and requires much more processing power than the
>> > initial stage.
>> > We are trying to understand why only a few map tasks are
>> > being used, as we are not getting the full advantage of our cluster.
>> >
>



-- 
Harsh J
www.harshj.com
