Date: Tue, 1 Feb 2011 23:38:28 +0000 (UTC)
From: "Scott Chen (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Message-ID: <687404693.4047.1296603508999.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Commented: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989454#comment-12989454 ]

Scott Chen commented on MAPREDUCE-1783:
---------------------------------------

I just committed this to 0.22.

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1783
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>    Affects Versions: 0.20.1
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali
>             Fix For: 0.22.0, 0.23.0
>
>         Attachments: 0001-Pool-aware-job-initialization.patch, 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch, submit-mapreduce-1783.patch
>
>
> The FairScheduler task scheduler uses PoolManager to impose limits on the number of jobs that can be running at a given time. However, jobs that are submitted are initialized immediately by EagerTaskInitializationListener, which calls JobInProgress.initTasks. This causes the job split file to be read into memory. The split information is not needed until the number of running jobs is less than the maximum specified. If the amount of split information is large, this leads to unnecessary memory pressure on the Job Tracker.
> To ease memory pressure, FairScheduler can use another implementation of JobInProgressListener that is aware of PoolManager limits and can delay task initialization until the number of running jobs is below the maximum.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
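
The delayed-initialization idea in the issue description can be illustrated with a minimal sketch. This is not the committed patch and uses no Hadoop classes: `Job`, `DelayedInitializer`, `jobAdded`, `jobCompleted`, and the single global `maxRunningJobs` limit are all hypothetical stand-ins (the real PoolManager limits may be more fine-grained than one global cap). The only point shown is that `initTasks()` is deferred until the running-job count drops below the limit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration only; none of these types are Hadoop classes.
final class PoolAwareInitSketch {

    /** Stand-in for a submitted job; initTasks() models the costly split read. */
    static final class Job {
        final String id;
        boolean initialized = false;

        Job(String id) { this.id = id; }

        void initTasks() { initialized = true; }
    }

    /** Defers initialization until fewer than maxRunningJobs jobs are running. */
    static final class DelayedInitializer {
        private final int maxRunningJobs;                      // assumed single global limit
        private final Queue<Job> waiting = new ArrayDeque<>(); // submitted, not yet initialized
        private final List<Job> running = new ArrayList<>();   // initialized and counted as running

        DelayedInitializer(int maxRunningJobs) { this.maxRunningJobs = maxRunningJobs; }

        /** Called on submission: queue the job instead of initializing it eagerly. */
        void jobAdded(Job job) {
            waiting.add(job);
            initializeRunnable();
        }

        /** Called when a job finishes: a slot may have opened for a waiting job. */
        void jobCompleted(Job job) {
            if (running.remove(job)) {
                initializeRunnable();
            }
        }

        int waitingCount() { return waiting.size(); }

        /** Initialize waiting jobs in FIFO order while there is room under the limit. */
        private void initializeRunnable() {
            while (running.size() < maxRunningJobs && !waiting.isEmpty()) {
                Job next = waiting.poll();
                next.initTasks();
                running.add(next);
            }
        }
    }

    public static void main(String[] args) {
        DelayedInitializer init = new DelayedInitializer(1);
        Job a = new Job("job_a");
        Job b = new Job("job_b");
        init.jobAdded(a);
        init.jobAdded(b);
        System.out.println(b.initialized); // false: the limit of 1 is already reached by job_a
        init.jobCompleted(a);
        System.out.println(b.initialized); // true: job_b is initialized once a slot frees up
    }
}
```

With an eager listener, both jobs would be initialized at submission; here the second job's split data would stay off the heap until the first job completes.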