hadoop-user mailing list archives

From alx...@aim.com
Subject Re: number of mapred slots
Date Tue, 18 Dec 2012 06:39:20 GMT
I have two slave nodes and one master. One slave is quad-core (4 CPUs, 16 GB memory), the other
slave is dual-core (2 CPUs, 16 GB memory), and the master is dual-core with 4 GB memory. I run Hadoop
and HBase, so both slaves already run 4 processes (datanode, tasktracker, HBase regionserver,
and zookeeper), and I have this config in mapred-site.xml:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>10</value>
  <description>the number of available cores on the tasktracker machines
    for map tasks
  </description>
</property>

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>7</value>
  <description>the number of available cores on the tasktracker machines
    for reduce tasks
  </description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>10</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>7</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>
  
 

To my understanding, this means that the number of reduce tasks must be 7. However, Hadoop scheduled
10 reducers and all of them started at once. There were no pending reducers. Can anyone explain
why 10 reducers were running and where those slots came from, given that there are 6 CPUs and 8 processes
already running on the slave nodes?

Thanks.
Alex.
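
A note on the slot arithmetic behind the question above. In Hadoop 1.x, mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum are limits per TaskTracker, not per cluster, and they are not tied to the physical core count. The sketch below multiplies the configured values out, assuming that only the two slaves run a TaskTracker (the thread does not say whether the master does). Why the job asked for 10 reducers rather than the mapred.reduce.tasks default of 7 is not stated in the thread; a job can set its own reducer count, so that part remains an assumption.

```python
# Per-TaskTracker slot limits, copied from the mapred-site.xml above.
MAP_SLOTS_PER_TRACKER = 10     # mapred.tasktracker.map.tasks.maximum
REDUCE_SLOTS_PER_TRACKER = 7   # mapred.tasktracker.reduce.tasks.maximum

# Assumption: only the two slave nodes run a TaskTracker.
NUM_TASKTRACKERS = 2


def cluster_slots(per_tracker: int, trackers: int) -> int:
    """Total slots that can be occupied at once across all TaskTrackers."""
    return per_tracker * trackers


total_map_slots = cluster_slots(MAP_SLOTS_PER_TRACKER, NUM_TASKTRACKERS)
total_reduce_slots = cluster_slots(REDUCE_SLOTS_PER_TRACKER, NUM_TASKTRACKERS)

# Number of reducers observed running in the message above.
requested_reducers = 10
pending_reducers = max(0, requested_reducers - total_reduce_slots)

print("map slots in cluster:   ", total_map_slots)
print("reduce slots in cluster:", total_reduce_slots)
print("pending reducers:       ", pending_reducers)
```

Under these assumptions the cluster offers more reduce slots than the 10 reducers need, which would be consistent with all of them starting at once.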


  

 

-----Original Message-----
From: Chris Embree <cembree@gmail.com>
To: user <user@hadoop.apache.org>
Sent: Mon, Dec 17, 2012 10:12 pm
Subject: Re: number of mapred slots


I think the rule of thumb (Hortonworks', at least) is 2x cores for map tasks and 1x cores
for reducers. I don't have my notes here, so I'm not 100% sure. It's just a guideline in any event.
:)


TEST, TEST, TEST.  :)
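
As a rough illustration of the 2x/1x guideline above, the following sketch applies it to the two slaves described in this thread. The function name and the dictionary of nodes are made up for the example, and the result ignores the cores already used by the datanode, tasktracker, regionserver, and zookeeper processes, so treat it as a starting point for testing rather than a recommended setting.

```python
from typing import Tuple


def rule_of_thumb_slots(cores: int) -> Tuple[int, int]:
    """Return (map_slots, reduce_slots): 2x cores for maps, 1x cores for reducers."""
    return 2 * cores, cores


# Core counts of the two slave nodes mentioned in the thread.
slaves = {"quad-core slave": 4, "dual-core slave": 2}

for name, cores in slaves.items():
    map_slots, reduce_slots = rule_of_thumb_slots(cores)
    print(f"{name}: map slots = {map_slots}, reduce slots = {reduce_slots}")
```

Because the slot limits are set per TaskTracker, the two slaves could each carry their own values in their local mapred-site.xml instead of sharing one pair of numbers.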


On Tue, Dec 18, 2012 at 1:08 AM,  <alxsss@aim.com> wrote:

Hello,

I was unable to find any information on the net regarding the relationship between mapred slots and the number
of CPUs. All I found was that it is advisable to schedule two processes per
CPU. If this is true, then for a dual-core slave node (two CPUs) that runs datanode,
tasktracker, HBase regionserver, and zookeeper, there is theoretically no room to run an additional
mapred task. Any comment on this is welcome.

In general, what is a mapred slot and how is it related to the number of CPU cores?

Thanks in advance.
Alex.



 
