Message-ID: <1266295162.297341266319887970.JavaMail.jira@brutus.apache.org>
Date: Tue, 16 Feb 2010 11:31:27 +0000 (UTC)
From: "Hudson (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Subject: [jira] Commented: (MAPREDUCE-1398) TaskLauncher remains stuck on tasks waiting for free nodes even if task is killed.
In-Reply-To: <442856609.7031264180161572.JavaMail.jira@brutus.apache.org>

[ https://issues.apache.org/jira/browse/MAPREDUCE-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834168#action_12834168 ]

Hudson commented on MAPREDUCE-1398:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #242 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/242/])
Fix TaskLauncher to stop waiting for slots on a TIP that is killed / failed. Contributed by Amareshwari Sriramadasu.

> TaskLauncher remains stuck on tasks waiting for free nodes even if task is killed.
> ----------------------------------------------------------------------------------
>
>          Key: MAPREDUCE-1398
>          URL: https://issues.apache.org/jira/browse/MAPREDUCE-1398
>      Project: Hadoop Map/Reduce
>   Issue Type: Bug
>   Components: tasktracker
>     Reporter: Hemanth Yamijala
>     Assignee: Amareshwari Sriramadasu
>      Fix For: 0.22.0
>
>  Attachments: patch-1398-1.txt, patch-1398-2.txt, patch-1398-ydist.txt, patch-1398.txt
>
>
> Tasks could be assigned to trackers for slots that are running other tasks in a commit pending state. This is an optimization done to pipeline task assignment and launch. When the task reaches the tracker, it waits until sufficient slots become free for it. This wait is done in the TaskLauncher thread. Now, while waiting, if the task is killed externally (maybe because the job finishes, etc), the TaskLauncher is not notified of this. So, it continues to wait for the killed task to get sufficient slots. If slots do not become free for a long time, this would result in considerable delay in waking up the TaskLauncher thread.
If the waiting task happens to be a high RAM task, then it is also wasteful, because by waking up, it can make way for normal tasks that can run on the available number of slots.
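
The behaviour described in the issue can be illustrated with a minimal Java sketch. This is not the actual TaskTracker code or the attached patch: the class SlotWaitSketch and its members (PendingTask, awaitSlots, kill, releaseSlots) are hypothetical names chosen for illustration only. The assumption is that killing a task sets a flag and notifies the waiting launcher, so the wait loop exits without taking any slots.

```java
// Hypothetical sketch: names and structure are illustrative assumptions,
// not the real Hadoop TaskTracker/TaskLauncher implementation.
final class SlotWaitSketch {
    private final Object lock = new Object();
    private int freeSlots; // guarded by lock

    SlotWaitSketch(int freeSlots) {
        this.freeSlots = freeSlots;
    }

    /** A task queued for launch that may need more than one slot. */
    static final class PendingTask {
        final int slotsNeeded;
        private boolean killed; // guarded by the enclosing sketch's lock

        PendingTask(int slotsNeeded) {
            this.slotsNeeded = slotsNeeded;
        }
    }

    /**
     * Blocks until enough slots are free or the task is killed.
     * Returns true if slots were acquired, false if the task was
     * killed (or the thread was interrupted) while waiting.
     */
    boolean awaitSlots(PendingTask task) {
        synchronized (lock) {
            try {
                // Re-check both conditions after every wake-up.
                while (!task.killed && freeSlots < task.slotsNeeded) {
                    lock.wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            if (task.killed) {
                return false; // stop waiting; leave the slots for other tasks
            }
            freeSlots -= task.slotsNeeded;
            return true;
        }
    }

    /** Marks the task as killed and wakes any launcher waiting on it. */
    void kill(PendingTask task) {
        synchronized (lock) {
            task.killed = true;
            lock.notifyAll();
        }
    }

    /** Returns slots to the pool and wakes waiting launchers. */
    void releaseSlots(int n) {
        synchronized (lock) {
            freeSlots += n;
            lock.notifyAll();
        }
    }

    int freeSlots() {
        synchronized (lock) {
            return freeSlots;
        }
    }
}
```

In this sketch, with one free slot, a killed two-slot (high RAM) task returns from awaitSlots without consuming anything, so a normal one-slot task can acquire the slot immediately afterwards.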