From: Ken Goodhope
To: common-user@hadoop.apache.org
Date: Thu, 8 Jul 2010 09:08:38 -0700
Subject: Re: How to control the number of map tasks for each nodes?

If you want a different number of tasks on different nodes, you will need
to look at one of the more advanced schedulers. FairScheduler and
CapacityScheduler are the most common. FairScheduler has extensibility
points where you can add your own logic for deciding whether a particular
node can schedule another task. I believe CapacityScheduler does this too,
but I haven't used it as much.

On Thu, Jul 8, 2010 at 6:49 AM, Jones, Nick wrote:

> Vitaliy/Edward,
> One thing to keep in mind is that overcommitting the number of cores can
> lead to map timeouts unless the map task submits progress updates to the
> jobtracker. I found that out the hard way with a few computationally
> expensive maps.
>
> Nick Jones
>
> -----Original Message-----
> From: Vitaliy Semochkin [mailto:vitaliy.se@gmail.com]
> Sent: Thursday, July 08, 2010 5:15 AM
> To: common-user@hadoop.apache.org
> Subject: Re: How to control the number of map tasks for each nodes?
>
> Hi,
>
> in mapred-site.xml you should place
>
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>8</value>
>   <description>the number of available cores on the tasktracker machines
>   for map tasks</description>
> </property>
>
> <property>
>   <name>mapred.tasktracker.reduce.tasks.maximum</name>
>   <value>8</value>
>   <description>the number of available cores on the tasktracker machines
>   for reduce tasks</description>
> </property>
>
> where 8 is the number of your CORES, not CPUs; if you have 8 dual-core
> processors, place 16 there.
> I found that having the number of map tasks a bit bigger than the number
> of cores is better, because sometimes Hadoop waits for IO operations and
> the tasks do nothing.
>
> Regards,
> Vitaliy S
>
> On Thu, Jul 8, 2010 at 1:07 PM, edward choi wrote:
>
> > Hi,
> >
> > I have a cluster consisting of 11 slaves and a single master.
> >
> > The thing is that 3 of my slaves have i7 CPUs, which means that they
> > can have up to 8 simultaneous processes.
> > But the other slaves only have dual-core CPUs.
> >
> > So I was wondering if I can specify the number of map tasks for each
> > of my slaves.
> > For example, I want to give 8 map tasks to the slaves that have i7 CPUs
> > and only two map tasks to the others.
> >
> > Is there a way to do this?
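
As a rough illustration of the numbers discussed in this thread (8 slots
for the i7 slaves, 2 for the dual-core slaves), the sketch below generates
the two slot-limit properties for each class of node. It is only a sketch
under assumptions: the helper names (slot_properties, render) and the
NODE_CLASSES table are hypothetical, the reduce limit is simply set equal
to the map limit, and whether per-node values are honored depends on your
Hadoop version and scheduler, so check that before relying on it.

```python
import xml.etree.ElementTree as ET

# Property names taken from the mapred-site.xml advice quoted in the thread.
MAP_LIMIT = "mapred.tasktracker.map.tasks.maximum"
REDUCE_LIMIT = "mapred.tasktracker.reduce.tasks.maximum"

# Assumption: two hardware classes with the slot counts the original
# poster asked for (8 on the i7 slaves, 2 on the dual-core slaves).
NODE_CLASSES = {"i7": 8, "dual-core": 2}


def slot_properties(map_slots, reduce_slots):
    """Hypothetical helper: build a <configuration> element holding the
    map and reduce slot limits for one class of node."""
    config = ET.Element("configuration")
    for name, value in ((MAP_LIMIT, map_slots), (REDUCE_LIMIT, reduce_slots)):
        prop = ET.SubElement(config, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return config


def render(config):
    """Serialize the element tree to an XML string (no XML declaration)."""
    return ET.tostring(config, encoding="unicode")


if __name__ == "__main__":
    for label, slots in NODE_CLASSES.items():
        # Assumption: reduce slots mirror map slots; tune separately if needed.
        print("# " + label)
        print(render(slot_properties(slots, slots)))
```

The printed fragments would then be merged by hand into the
mapred-site.xml of the matching machines; nothing here talks to a running
cluster.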