hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: how to set mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum
Date Tue, 10 Jan 2012 10:28:31 GMT
Yes, divide the number of cores between map and reduce slots. Depending on your workload, start
with a 4:3 ratio and tune from there over time (if you run more map-only jobs, shift the ratio
toward map slots, etc.).

Changing the slot params requires restarting only the TaskTrackers, not the JobTracker, so you
can do it on a live cluster without much trouble.
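
As a rough sketch of that arithmetic (not an official formula), the split can be computed per node as below. The helper names split_slots and render_properties are hypothetical, and the defaults (two cores held back for other daemons, rounding to the nearest whole slot) are assumptions you should adjust. The 24-core example matches the 2 * 12 cores per node mentioned in this thread.

```python
def split_slots(total_cores, reserved=2, map_parts=4, reduce_parts=3):
    """Split a node's cores into (map_slots, reduce_slots) at map_parts:reduce_parts.

    `reserved` is an assumed number of cores held back for other daemons
    (for example DataNode or HBase); set it to 0 if nothing else runs there.
    """
    usable = total_cores - reserved
    if usable < 2:
        raise ValueError("need at least two usable cores for one map and one reduce slot")
    # Round the map share to the nearest whole slot, then give the rest to reduces.
    map_slots = round(usable * map_parts / (map_parts + reduce_parts))
    # Keep at least one slot of each kind.
    map_slots = min(max(map_slots, 1), usable - 1)
    return map_slots, usable - map_slots


def render_properties(map_slots, reduce_slots):
    """Render the two slot properties as mapred-site.xml <property> entries."""
    template = (
        "<property>\n"
        "  <name>{name}</name>\n"
        "  <value>{value}</value>\n"
        "</property>"
    )
    return "\n".join([
        template.format(name="mapred.tasktracker.map.tasks.maximum", value=map_slots),
        template.format(name="mapred.tasktracker.reduce.tasks.maximum", value=reduce_slots),
    ])


if __name__ == "__main__":
    maps, reduces = split_slots(24, reserved=2)
    print(render_properties(maps, reduces))
```

The printed entries would go inside the <configuration> element of mapred-site.xml on each TaskTracker node. Treat the result as a starting point only and retune the ratio for your own workload.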

On 10-Jan-2012, at 3:20 PM, hao.wang wrote:

> Hi,
>    Thanks for your help; your suggestion is very useful.
>    I have another question: should the sum of map and reduce slots equal the
> total number of cores?
> 
> regards!
> 
> 
> 2012-01-10 
> 
> 
> 
> hao.wang 
> 
> 
> 
> From: Harsh J
> Sent: 2012-01-10  16:44:07
> To: common-user
> Cc:
> Subject: Re: how to set mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum

> 
> Hello Hao,
> I am sorry if I confused you. By CPUs I meant the CPUs visible to your OS (/proc/cpuinfo),
> so yes, the total number of cores.
> On 10-Jan-2012, at 12:39 PM, hao.wang wrote:
>> Hi,
>> 
>> Thanks for your reply!
>> I am not sure I can apply your suggestion to our Hadoop cluster,
>> because each server in our cluster contains just 2 CPUs.
>> So I think maybe you mean the number of cores, not the number of CPUs, in each server?
>> I look forward to your reply.
>> 
>> regards!
>> 
>> 
>> 2012-01-10 
>> 
>> 
>> 
>> hao.wang 
>> 
>> 
>> 
>> From: Harsh J
>> Sent: 2012-01-10  11:33:38
>> To: common-user
>> Cc:
>> Subject: Re: how to set mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum

>> 
>> Hello again,
>> Try a 4:3 ratio between maps and reduces, against the total number of available CPUs per
>> node (minus one or two, for DN and HBase if you run those). Then tweak it as you go: whether
>> you run more map-only loads or more map-reduce loads depends on your usage, and you can adjust
>> the ratio accordingly over time. Changing those props does not need JobTracker restarts, just
>> TaskTracker restarts.
>> On 10-Jan-2012, at 8:17 AM, hao.wang wrote:
>>> Hi,
>>>  Thanks for your reply!
>>>  I had already read those pages. If possible, can you give me some more specific suggestions
>>> about how to choose the values of mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum
>>> according to our cluster configuration?
>>> 
>>> regards!
>>> 
>>> 
>>> 2012-01-10 
>>> 
>>> 
>>> 
>>> hao.wang 
>>> 
>>> 
>>> 
>>> From: Harsh J
>>> Sent: 2012-01-09  23:19:21
>>> To: common-user
>>> Cc:
>>> Subject: Re: how to set mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum

>>> 
>>> Hi,
>>> Please read http://hadoop.apache.org/common/docs/current/single_node_setup.html
>>> to learn how to configure Hadoop using the various *-site.xml configuration files, and then
>>> follow http://hadoop.apache.org/common/docs/current/cluster_setup.html to achieve optimal
>>> configs for your cluster.
>>> On 09-Jan-2012, at 5:50 PM, hao.wang wrote:
>>>> Hi all,
>>>> Our Hadoop cluster has 22 nodes, including one namenode, one jobtracker, and
>>>> 20 datanodes.
>>>> Each node has 2 * 12 cores with 32G RAM.
>>>> Can anyone tell me how to configure the following parameters:
>>>> mapred.tasktracker.map.tasks.maximum
>>>> mapred.tasktracker.reduce.tasks.maximum
>>>> 
>>>> regards!
>>>> 2012-01-09 
>>>> 
>>>> 
>>>> 
>>>> hao.wang 

