Subject: Re: how to set mapred.tasktracker.map.tasks.maximum and
 mapred.tasktracker.reduce.tasks.maximum
From: Harsh J <harsh@cloudera.com>
Date: Tue, 10 Jan 2012 15:58:31 +0530
To: common-user@hadoop.apache.org

Yes, divide the number of cores between map and reduce slots. Depending on your workload, start with a 4:3 ratio of map to reduce slots and tune from there (for example, if you run more map-only jobs, shift the ratio toward maps).
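
As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name split_slots and the reserved parameter are hypothetical (they are not part of Hadoop); reserved stands for the one or two cores set aside for the DataNode and HBase mentioned earlier in this thread, and 24 is the per-node core count from the original question.

```python
def split_slots(cores, reserved=2, map_share=4, reduce_share=3):
    """Split usable cores into (map_slots, reduce_slots) at map_share:reduce_share.

    `reserved` is an assumption: cores left free for other daemons on the node.
    """
    # Keep at least two usable cores so each slot type gets at least one.
    usable = max(cores - reserved, 2)
    # Give the map side its share of the usable cores, rounded to a whole slot.
    map_slots = round(usable * map_share / (map_share + reduce_share))
    # Clamp so that neither side ends up with zero slots.
    map_slots = min(max(map_slots, 1), usable - 1)
    reduce_slots = usable - map_slots
    return map_slots, reduce_slots


# 2 * 12 = 24 cores per node, two cores reserved: 22 usable cores.
print(split_slots(24))  # (13, 9)
```

Treat the result as a starting point only; the ratio still needs tuning against the actual job mix.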

Changing the slot parameters requires restarting only the TaskTrackers, not the JobTracker, so you can do it on a live cluster without much trouble.
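
For reference, a small Python sketch that prints the two properties in the <property> form used by the *-site.xml files (on Hadoop 1.x these normally live in mapred-site.xml on each TaskTracker node). The helper name slot_properties is hypothetical, and the values 13 and 9 are placeholders (roughly a 4:3 split of 22 cores), not a recommendation.

```python
from xml.sax.saxutils import escape


def slot_properties(map_slots, reduce_slots):
    """Render the two TaskTracker slot settings as <property> elements."""
    props = {
        "mapred.tasktracker.map.tasks.maximum": map_slots,
        "mapred.tasktracker.reduce.tasks.maximum": reduce_slots,
    }
    lines = []
    for name, value in props.items():
        lines += [
            "<property>",
            f"  <name>{escape(name)}</name>",
            f"  <value>{int(value)}</value>",
            "</property>",
        ]
    return "\n".join(lines)


# Placeholder values; paste the output inside <configuration> ... </configuration>.
print(slot_properties(13, 9))
```

After editing the file on a node, only that node's TaskTracker needs a restart to pick up the new slot counts.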

On 10-Jan-2012, at 3:20 PM, hao.wang wrote:

> Hi,
>    Thanks for your help, your suggestion is very useful.
>    I have another question: should the sum of map and reduce slots equal the total number of cores?
>
> regards!
>
> 2012-01-10
> hao.wang
>
> From: Harsh J
> Sent: 2012-01-10 16:44:07
> To: common-user
> Cc:
> Subject: Re: how to set mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum
>
> Hello Hao,
> I am sorry if I confused you. By CPUs I meant the CPUs visible to your OS (/proc/cpuinfo), so yes, the total number of cores.
> On 10-Jan-2012, at 12:39 PM, hao.wang wrote:
>> Hi,
>>
>> Thanks for your reply!
>> Your suggestion may not apply to our Hadoop cluster, because each server in it has just 2 CPUs.
>> So I think you may mean the number of cores, not the number of CPUs, in each server?
>> I look forward to your reply.
>>
>> regards!
>>
>> 2012-01-10
>> hao.wang
>>
>> From: Harsh J
>> Sent: 2012-01-10 11:33:38
>> To: common-user
>> Cc:
>> Subject: Re: how to set mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum
>>
>> Hello again,
>> Try a 4:3 ratio between maps and reduces, against the total number of available CPUs per node (minus one or two, for the DataNode and HBase if you run those). Then tweak it as you go: whether you run more map-only loads or more map-reduce loads depends on your usage, and you can adjust the ratio accordingly over time. Changing those properties does not need a JobTracker restart, just TaskTracker restarts.
>> On 10-Jan-2012, at 8:17 AM, hao.wang wrote:
>>> Hi,
>>>  Thanks for your reply!
>>>  I had already read those pages. Can you give me some more specific suggestions on how to choose the values of mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum for our cluster configuration, if possible?
>>>
>>> regards!
>>>
>>> 2012-01-10
>>> hao.wang
>>>
>>> From: Harsh J
>>> Sent: 2012-01-09 23:19:21
>>> To: common-user
>>> Cc:
>>> Subject: Re: how to set mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum
>>>
>>> Hi,
>>> Please read http://hadoop.apache.org/common/docs/current/single_node_setup.html to learn how to configure Hadoop using the various *-site.xml configuration files, and then follow http://hadoop.apache.org/common/docs/current/cluster_setup.html to achieve optimal configs for your cluster.
>>> On 09-Jan-2012, at 5:50 PM, hao.wang wrote:
>>>> Hi all,
>>>> Our Hadoop cluster has 22 nodes: one NameNode, one JobTracker, and 20 DataNodes.
>>>> Each node has 2 * 12 cores and 32 GB of RAM.
>>>> Can anyone tell me how to configure the following parameters?
>>>> mapred.tasktracker.map.tasks.maximum
>>>> mapred.tasktracker.reduce.tasks.maximum
>>>>
>>>> regards!
>>>> 2012-01-09
>>>> hao.wang