Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Reply-To: core-dev@hadoop.apache.org
Message-ID: <1367965497.1223534086188.JavaMail.jira@brutus>
Date: Wed, 8 Oct 2008 23:34:46 -0700 (PDT)
From: "dhruba borthakur (JIRA)"
To: core-dev@hadoop.apache.org
Subject: [jira] Updated: (HADOOP-4018) limit memory usage in jobtracker
In-Reply-To: <1787086759.1219689704288.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

     [ https://issues.apache.org/jira/browse/HADOOP-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-4018:
-------------------------------------

    Fix Version/s: 0.19.0

Thanks, Amar, for reviewing it. I am marking this for 0.19 because the limit is very necessary for clusters that have permanent JobTrackers (i.e. not using HOD); otherwise a single erroneous job could swamp the entire cluster. The fix is very low-risk, so I am proposing that it go into the 0.19 branch.

> limit memory usage in jobtracker
> --------------------------------
>
>                 Key: HADOOP-4018
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4018
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: maxSplits.patch, maxSplits10.patch, maxSplits2.patch, maxSplits3.patch, maxSplits4.patch, maxSplits5.patch, maxSplits6.patch, maxSplits7.patch, maxSplits8.patch, maxSplits9.patch
>
>
> We have seen instances where a user submitted a job with many thousands of mappers. The JobTracker was running with a 3 GB heap, but that was still not enough to prevent memory thrashing from garbage collection; effectively, the JobTracker was unable to serve jobs and had to be restarted.
> One simple proposal would be to limit the maximum number of tasks per job. This could be a configurable parameter. Are there other things that eat huge globs of memory in the JobTracker?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
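
[Editor's sketch] The attached maxSplits patches are not included in this message, but the proposal above (a configurable cap on tasks per job, checked before the JobTracker allocates per-task state) could look roughly like the following. The configuration key, class, and method names here are illustrative assumptions and may not match what the actual patch does.

// Hypothetical sketch of a per-job task cap enforced at job-initialization
// time on the JobTracker. Failing fast keeps a single oversized job from
// inflating the JobTracker heap with per-task bookkeeping structures.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

public class JobTaskLimitCheck {

  // Assumed property name; a value <= 0 means "no limit".
  public static final String MAX_TASKS_PER_JOB_KEY =
      "mapred.jobtracker.maxtasks.per.job";

  /**
   * Rejects the job if its total task count exceeds the configured cap.
   * In this sketch it would be called from the JobTracker's job-setup path,
   * before any task-in-progress objects are created.
   */
  public static void checkTaskLimit(Configuration conf,
                                    int numMapTasks,
                                    int numReduceTasks) throws IOException {
    int maxTasks = conf.getInt(MAX_TASKS_PER_JOB_KEY, -1);
    int totalTasks = numMapTasks + numReduceTasks;
    if (maxTasks > 0 && totalTasks > maxTasks) {
      throw new IOException("The number of tasks for this job (" + totalTasks
          + ") exceeds the configured limit of " + maxTasks);
    }
  }
}

A cluster admin would then set the (assumed) property in the JobTracker's configuration, and any job whose map plus reduce count exceeds the limit fails at submission instead of consuming heap.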