hadoop-common-user mailing list archives

From Ken Goodhope <kengoodh...@gmail.com>
Subject Re: How to control the number of map tasks for each nodes?
Date Thu, 08 Jul 2010 16:08:38 GMT
If you want to have a different number of tasks for different nodes, you
will need to look at one of the more advanced schedulers.  FairScheduler and
CapacityScheduler are the most common.  FairScheduler has extensibility
points where you can add your own logic for deciding if a particular node
can schedule another task.  I believe CapacityScheduler does this too, but I
haven't used it as much.
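
For the slot counts themselves, a simpler approach may be enough, assuming
each TaskTracker reads its slot limits from its own local mapred-site.xml at
startup (as far as I know, that is the case for the property names quoted
below). A minimal sketch for one of the i7 nodes; the dual-core nodes would
carry the same property with a value of 2. The reduce value is only a
placeholder, since the thread does not say how many reduce slots are wanted:

```xml
<!-- mapred-site.xml on an i7 node only (sketch; values are illustrative) -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <!-- placeholder value, not taken from the thread -->
  <value>2</value>
</property>
```

The TaskTracker on that node would need a restart to pick up the change.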

On Thu, Jul 8, 2010 at 6:49 AM, Jones, Nick <nick.jones@amd.com> wrote:

> Vitaliy/Edward,
> One thing to keep in mind is that overcommitting the number of cores can
> lead to map timeouts unless the map task submits progress updates to the
> JobTracker.  I found that out the hard way with a few computationally
> expensive maps.
>
> Nick Jones
>
> -----Original Message-----
> From: Vitaliy Semochkin [mailto:vitaliy.se@gmail.com]
> Sent: Thursday, July 08, 2010 5:15 AM
> To: common-user@hadoop.apache.org
> Subject: Re: How to control the number of map tasks for each nodes?
>
> Hi,
>
> in mapred-site.xml you should place
>
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>8</value>
>   <description>the number of available cores on the tasktracker machines
>   for map tasks</description>
> </property>
> <property>
>   <name>mapred.tasktracker.reduce.tasks.maximum</name>
>   <value>8</value>
>   <description>the number of available cores on the tasktracker machines
>   for reduce tasks</description>
> </property>
>
> where 8 is the number of your CORES, not CPUs: if you have 8 dual-core
> processors, put 16 there.
> I have found that setting the number of map tasks slightly higher than the
> number of cores works better, because tasks sometimes sit idle while
> Hadoop waits on I/O.
>
> Regards,
> Vitaliy S
>
> On Thu, Jul 8, 2010 at 1:07 PM, edward choi <mp2893@gmail.com> wrote:
>
> > Hi,
> >
> > I have a cluster consisting of 11 slaves and a single master.
> >
> > The thing is that 3 of my slaves have i7 CPUs, which means that they can
> > have up to 8 simultaneous processes.
> > But the other slaves only have dual-core CPUs.
> >
> > So I was wondering if I can specify the number of map tasks for each of
> > my slaves.
> > For example, I want to give 8 map tasks to the slaves that have i7 CPUs
> > and only two map tasks to the others.
> >
> > Is there a way to do this?
> >
>
>
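
On the map timeouts mentioned above: the limit is governed by
mapred.task.timeout, given in milliseconds. A hedged sketch of raising it
follows; the 30-minute value is an arbitrary assumption, and reporting
progress from inside the map task is usually the better fix than a longer
timeout:

```xml
<property>
  <name>mapred.task.timeout</name>
  <!-- 1800000 ms = 30 minutes; illustrative value only -->
  <value>1800000</value>
</property>
```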
