From: "Amareshwari Sriramadasu (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Date: Wed, 9 Jun 2010 01:14:13 -0400 (EDT)
Subject: [jira] Resolved: (MAPREDUCE-1781) option "-D mapred.tasktracker.map.tasks.maximum=1" does not work when no of mappers is bigger than no of nodes - always spawns 2 mapers/node
Message-ID: <21374184.41191276060453006.JavaMail.jira@thor>
In-Reply-To: <18111921.1561273519110663.JavaMail.jira@thor>

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu resolved MAPREDUCE-1781.
------------------------------------------------

    Resolution: Invalid

bq. Regarding the initial problem, I think it would help a lot of people (especially new users) to specify in the config page [ http://hadoop.apache.org/common/docs/current/mapred-default.html ] which parameters are set at startup and which at job runtime.

In branch 0.21, the configuration names are standardized through MAPREDUCE-849. Configuration names with the prefix mapreduce.cluster, mapreduce.jobtracker, or mapreduce.tasktracker are server-level configurations and must be set up before the cluster is brought up. Configurations with the prefix mapreduce.job, mapreduce.task, mapreduce.map, or mapreduce.reduce are job-level configurations. Documenting all of them in mapred-default is being tracked in MAPREDUCE-1021.

Closing this as invalid.

> option "-D mapred.tasktracker.map.tasks.maximum=1" does not work when no of mappers is bigger than no of nodes - always spawns 2 mapers/node
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1781
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1781
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.20.2
>         Environment: Debian Lenny x64, and Hadoop 0.20.2, 2GB RAM
>            Reporter: Tudor Vlad
>
> Hello
> I am a new user of Hadoop and I have some trouble using Hadoop Streaming and the "-D mapred.tasktracker.map.tasks.maximum" option.
> I'm experimenting with an unmanaged application (C++) which I want to run over several nodes in 2 scenarios:
> 1) the number of maps (input splits) is equal to the number of nodes
> 2) the number of maps is a multiple of the number of nodes (5, 10, 20, ...
> Initially, when running the tests in scenario 1, I would sometimes get 2 processes/node on half the nodes. However, I fixed this by adding the option "-D mapred.tasktracker.map.tasks.maximum=1", so everything works fine.
> In scenario 2 (more maps than nodes) this directive no longer works: I always get 2 processes/node. I even tested with maximum=5 and I still get 2 processes/node.
> The entire command I use is:
> /usr/bin/time --format="-duration:\t%e |\t-MFaults:\t%F |\t-ContxtSwitch:\t%w" \
> /opt/hadoop/bin/hadoop jar /opt/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \
> -D mapred.tasktracker.map.tasks.maximum=1 \
> -D mapred.map.tasks=30 \
> -D mapred.reduce.tasks=0 \
> -D io.file.buffer.size=5242880 \
> -libjars "/opt/hadoop/contrib/streaming/hadoop-7debug.jar" \
> -input input/test \
> -output out1 \
> -mapper "/opt/jobdata/script_1k" \
> -inputformat "me.MyInputFormat"
> Why is this happening and how can I make it work properly (i.e. be able to limit exactly how many mappers I can have at 1 time per node)?
> Thank you in advance

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
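
For readers of the archive, the prefix convention described in the resolution can be sketched as a small lookup. This is only an illustration, not part of Hadoop: the function name `config_scope` is hypothetical, the prefix lists are copied from the resolution text, and the sample names in the usage lines are assumed 0.21-style names. A 0.20-era name such as the one in the report predates the convention, so the sketch cannot classify it by prefix.

```python
# Prefixes as listed in the resolution comment (branch 0.21, MAPREDUCE-849).
# A trailing dot is added so "mapreduce.task." does not match
# "mapreduce.tasktracker.".
SERVER_PREFIXES = (
    "mapreduce.cluster.",
    "mapreduce.jobtracker.",
    "mapreduce.tasktracker.",
)
JOB_PREFIXES = (
    "mapreduce.job.",
    "mapreduce.task.",
    "mapreduce.map.",
    "mapreduce.reduce.",
)


def config_scope(name: str) -> str:
    """Return 'server', 'job', or 'unknown' for a configuration name.

    Hypothetical helper: 'server' means the value must be set before the
    cluster is brought up; 'job' means it can be supplied per job.
    """
    if name.startswith(SERVER_PREFIXES):
        return "server"
    if name.startswith(JOB_PREFIXES):
        return "job"
    return "unknown"


# Usage (sample names are assumptions for illustration):
print(config_scope("mapreduce.tasktracker.map.tasks.maximum"))
print(config_scope("mapreduce.job.maps"))
print(config_scope("mapred.tasktracker.map.tasks.maximum"))
```

Under this reading, a name classified as "server" is one that passing "-D" on a job's command line would not be expected to change, which is consistent with the issue being closed as invalid.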