hadoop-common-dev mailing list archives

From Starry SHI <starr...@gmail.com>
Subject Re: Why I can only run 2 map/reduce task at a time?
Date Mon, 21 Dec 2009 16:13:47 GMT
Hi guys,

Thank you for your reply!

I have already set those configuration properties, and my cluster's Map/Reduce
capacity is greater than 2. The data are also large enough to be split into more
than 2 map tasks. However, no matter how I set the configuration, only 2 map
tasks are running within the first heartbeat interval. As an example, consider a
job with 4 map tasks. I expected the launch to look like this:

task   |  offset to start time
--------------------------------
map1 |     0sec
map2 |     0~2sec
map3 |     0~2sec
map4 |     0~2sec

This would show the 4 map tasks being launched simultaneously.

But the result turned out to be:

task   |  offset to start time
--------------------------------
map1 |     0sec
map2 |     1sec
map3 |     5sec
map4 |     6sec

This shows that within the first heartbeat interval (5 sec), only two map tasks
are launched; the other two wait until the next heartbeat. Why can't all four
map tasks be launched together?

The 5-second delay is amplified as the total number of machines increases, and
the overall execution time suffers significantly. If we could launch as many map
tasks as possible within one heartbeat, filling all the available slots, there
would be a great improvement in performance.
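
For reference, here is how the slot settings from my earlier mail look in
conf/hadoop-site.xml (property names as in the 0.19.x line; the values of 10 are
the ones I mentioned below, shown here only for illustration):

```xml
<!-- conf/hadoop-site.xml (0.19.x naming); values are illustrative -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>10</value>
  <description>Maximum number of map tasks run concurrently per TaskTracker.</description>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>10</value>
  <description>Maximum number of reduce tasks run concurrently per TaskTracker.</description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>10</value>
  <description>Per-job hint for the number of map tasks.</description>
</property>
```

Note that the tasktracker.*.maximum properties take effect on the TaskTrackers
themselves, so the daemons must be restarted after the change.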

I would like to hear your suggestions and opinions on this.

Best regards,
Starry

/* Tomorrow is another day. So is today. */


On Mon, Dec 21, 2009 at 14:41, Chandraprakash Bhagtani <cpbhagtani@gmail.com
> wrote:

> You can increase the map/reduce slots only with the
> "mapred.tasktracker.map(reduce).tasks.maximum" properties.
>
> There can be the following cases:
>
> 1. Your changes are not taking effect. You need to restart the cluster
> after making changes in the conf XML.
>    You can check your cluster's (Map/Reduce) capacity at
> http://jobtracker-address:50030/
>
> 2. Your data is not large enough to create more than 2 map tasks. But in
> that case the number of reducers should still equal
>    mapred.reduce.tasks.
>
> On Mon, Dec 21, 2009 at 9:39 AM, Starry SHI <starrysl@gmail.com> wrote:
>
> > Hi,
> >
> > I am currently using Hadoop 0.19.2 for large data processing. I noticed
> > that when a job is launched, only two map/reduce tasks are running at the
> > very beginning; after one heartbeat (5 sec), another two map/reduce tasks
> > are started. How can I increase the number of map/reduce
> > slots?
> >
> > In the configuration file, I have already set
> > "mapred.tasktracker.map(reduce).tasks.maximum" to 10, and
> > "mapred.map(reduce).tasks" to 10. But still only 2 are
> > launched.
> >
> > Eager to hear your solutions!
> >
> > Best regards,
> > Starry
> >
> > /* Tomorrow is another day. So is today. */
> >
>
>
>
> --
> Thanks & Regards,
> Chandra Prakash Bhagtani,
> Impetus Infotech (india) Pvt Ltd.
>
