From: alxsss@aim.com
To: user@hadoop.apache.org, chris@embree.us
Subject: Re: number of mapred slots
Date: Tue, 18 Dec 2012 01:39:20 -0500 (EST)

I have two slave nodes and one master. One slave node has a quad core (4 CPUs, 16 GB mem), the other slave has a dual core (2 CPUs, 16 GB mem), and the master has a dual core with 4 GB mem. I run Hadoop and HBase, so both slaves already run 4 processes (datanode, tasktracker, hbase regionserver, and zookeeper), and I have this config in mapred-site.xml:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>10</value>
   <description>the number of available cores on the tasktracker machines
for map tasks
  </description>
</property>

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>7</value>
   <description>the number of available cores on the tasktracker machines
for reduce tasks
  </description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>10</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>7</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>
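(Side note on reading the config above: the two `*.tasks.maximum` settings are per tasktracker, not cluster-wide, so with two slave nodes the scheduler sees more slots than the per-node numbers alone suggest. A quick sketch of the arithmetic, assuming both slaves load this same mapred-site.xml:)

```python
# Rough slot arithmetic for the cluster described above (a sketch, not
# Hadoop's actual scheduler): the *.tasks.maximum settings apply to EACH
# tasktracker, so cluster capacity is the per-node maximum times the
# number of tasktrackers running.
map_tasks_max_per_tt = 10     # mapred.tasktracker.map.tasks.maximum
reduce_tasks_max_per_tt = 7   # mapred.tasktracker.reduce.tasks.maximum
num_tasktrackers = 2          # the two slave nodes

cluster_map_slots = map_tasks_max_per_tt * num_tasktrackers
cluster_reduce_slots = reduce_tasks_max_per_tt * num_tasktrackers

print(cluster_map_slots)     # 20 map slots cluster-wide
print(cluster_reduce_slots)  # 14 reduce slots cluster-wide
```

With 14 reduce slots configured cluster-wide, 10 reducers can all start at once: MRv1 slots are just configured counters, not CPU reservations, so they are not limited by the 6 physical cores.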
 
 

To my understanding this means that the number of reduce tasks must be 7. However, Hadoop scheduled 10 reducers and all of them started at once; there were no pending reducers. Can anyone explain why 10 reducers were running, and where those slots came from, if there were 6 CPUs and 8 processes already running on the slave nodes?

Thanks.
Alex.


 


-----Original Message-----
From: Chris Embree <cembree@gmail.com>
To: user <user@hadoop.apache.org>
Sent: Mon, Dec 17, 2012 10:12 pm
Subject: Re: number of mapred slots

I think the rule of thumb (Hortonworks, at least) is 2x cores for map threads and 1x cores for reducers. I don't have my notes here, so I'm not 100% sure. It's just a guideline in any event. :)

TEST, TEST, TEST. :)
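(That 2x/1x guideline, applied to the two slaves described in this thread; a sketch only, since the multipliers are a rough rule of thumb, not a Hadoop default:)

```python
# Sketch of the "2x cores for map slots, 1x cores for reduce slots"
# rule of thumb, applied per tasktracker node.
def suggested_slots(cores, map_factor=2, reduce_factor=1):
    """Return (map_slots, reduce_slots) suggested for one node."""
    return cores * map_factor, cores * reduce_factor

for name, cores in [("quad-core slave", 4), ("dual-core slave", 2)]:
    m, r = suggested_slots(cores)
    print(f"{name}: {m} map slots, {r} reduce slots")
```

By this rule the quad-core slave would get 8 map / 4 reduce slots and the dual-core slave 4 map / 2 reduce, well below the 10/7 configured above.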

On Tue, Dec 18, 2012 at 1:08 AM, <alxsss@aim.com> wrote:
Hello,

I was unable to find any information regarding the relationship between mapred slots and the number of CPUs
on the net. All I found was that it is advisable to schedule two processes per CPU. If this is true, then for a slave node with a dual core (two CPUs) that runs datanode, tasktracker, hbase regionserver, and zookeeper, theoretically there is no room to run an additional mapred task. Any comment on this is welcome.

In general, what is a mapred slot and how is it related to the number of CPU cores?

Thanks in advance.
Alex.
