From: Ted Dunning
Date: Mon, 17 Sep 2007 08:51:19 -0700
Subject: Re: Hadoop ignores attempts to set mappers/machine?
Reply-To: hadoop-user@lucene.apache.org

I have found that this parameter tends to be the limiting factor:

    <property>
      <name>mapred.tasktracker.tasks.maximum</name>
      <value>3</value>
      <description>The maximum number of tasks that will be run
      simultaneously by a task tracker.</description>
    </property>

There are several competing constraints at work, which makes it hard
to determine just how many map tasks will be run.

On 9/17/07 5:01 AM, "Toby DiPasquale" wrote:

> Hi all,
>
> No matter what I try, the number of mapper tasks on a given machine is
> always 2. JobConf.setNumMapTasks(X) has no effect, nor does setting
> mapred.map.tasks in the mapred-default.xml configuration. Why are
> these settings ignored? How can I truly increase the number of map
> tasks on a given machine?
>
> I ran a job last night (using 0.14.1) that took 31.5 minutes to map
> 7.5 GB (on HDFS, not s3fs) and then 78 seconds to reduce the results
> of that map (starting from 15% complete when the map phase hit 100%).
> The map took so long because only 6 - 8 out of the 171 mappers were
> running at any one time. I'd really like to know how to move the
> needle on this one so if anyone has any insight, I'd really appreciate
> it. Thanks.
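
As a rough illustration of why the per-tracker limit matters: if each task tracker runs at most a fixed number of tasks at once, the number of map tasks running simultaneously cannot exceed that limit times the number of task trackers. The sketch below works through that arithmetic for the 171 map tasks in the quoted message. The tracker count of 4 is an assumption (the thread does not state the cluster size), the function names are hypothetical, and other constraints may lower the real figure.

```python
import math


def max_concurrent_tasks(num_trackers: int, tasks_per_tracker: int) -> int:
    """Upper bound on tasks running at once across all task trackers."""
    return num_trackers * tasks_per_tracker


def map_waves(total_map_tasks: int, concurrent: int) -> int:
    """Rounds needed if every round fills all available slots."""
    return math.ceil(total_map_tasks / concurrent)


TOTAL_MAPS = 171   # from the quoted message
NUM_TRACKERS = 4   # assumption: cluster size is not given in the thread

# 2 is the per-machine count reported in the question, 3 is the value in the reply.
for limit in (2, 3):
    slots = max_concurrent_tasks(NUM_TRACKERS, limit)
    waves = map_waves(TOTAL_MAPS, slots)
    print(f"limit={limit}: at most {slots} concurrent tasks, {waves} waves")
```

Under these assumptions, raising the per-tracker limit reduces the number of waves the map phase needs, which is why that setting, rather than the requested number of map tasks, would bound how many mappers run at once.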