From: "Amareshwari Sriramadasu (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Date: Tue, 16 Feb 2010 11:05:28 +0000 (UTC)
Message-ID: <414658638.297111266318328047.JavaMail.jira@brutus.apache.org>
Subject: [jira] Updated: (MAPREDUCE-1398) TaskLauncher remains stuck on tasks waiting for free nodes even if task is killed.
In-Reply-To: <442856609.7031264180161572.JavaMail.jira@brutus.apache.org>

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1398:
-----------------------------------------------

    Release Note: Fixed TaskLauncher to stop waiting for blocking slots, for a TIP that is killed / failed while it is in queue.

> TaskLauncher remains stuck on tasks waiting for free nodes even if task is killed.
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1398
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1398
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>            Reporter: Hemanth Yamijala
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.22.0
>
>         Attachments: patch-1398-1.txt, patch-1398-2.txt, patch-1398-ydist.txt, patch-1398.txt
>
>
> Tasks could be assigned to trackers for slots that are running other tasks in a commit pending state. This is an optimization done to pipeline task assignment and launch. When the task reaches the tracker, it waits until sufficient slots become free for it. This wait is done in the TaskLauncher thread. Now, while waiting, if the task is killed externally (maybe because the job finishes, etc), the TaskLauncher is not notified of this. So, it continues to wait for the killed task to get sufficient slots. If slots do not become free for a long time, this would result in considerable delay in waking up the TaskLauncher thread. If the waiting task happens to be a high RAM task, then it is also wasteful, because by waking up, it can make way for normal tasks that can run on the available number of slots.
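
The behaviour described in the issue can be illustrated with a small Java sketch. This is not the code from the attached patches, and it does not use any Hadoop classes: SlotWaitSketch, QueuedTask, Launcher, awaitSlots, releaseSlots and kill are all hypothetical names chosen for illustration. The sketch assumes a single monitor guards the free-slot count. Killing a queued task notifies that same monitor, and the waiting thread re-checks the kill flag every time it wakes, so it stops waiting instead of holding out for slots that a dead task will never use.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SlotWaitSketch {

    /** Hypothetical stand-in for a queued task; not Hadoop's TaskInProgress. */
    static final class QueuedTask {
        final int slotsRequired;
        private volatile boolean killed = false;

        QueuedTask(int slotsRequired) {
            this.slotsRequired = slotsRequired;
        }

        boolean isKilled() {
            return killed;
        }
    }

    /** Hypothetical launcher that tracks free slots under one lock. */
    static final class Launcher {
        private final Object lock = new Object();
        private int freeSlots;

        Launcher(int freeSlots) {
            this.freeSlots = freeSlots;
        }

        /**
         * Blocks until enough slots are free, then claims them and returns true.
         * Returns false, claiming nothing, if the task is killed (or the thread
         * is interrupted) before it obtains its slots.
         */
        boolean awaitSlots(QueuedTask task) {
            synchronized (lock) {
                while (freeSlots < task.slotsRequired) {
                    // Re-check the kill flag on every wake-up, not only slot counts.
                    if (task.isKilled()) {
                        return false;
                    }
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return false;
                    }
                }
                if (task.isKilled()) {
                    return false;
                }
                freeSlots -= task.slotsRequired;
                return true;
            }
        }

        /** Returns slots to the pool and wakes any waiting launcher thread. */
        void releaseSlots(int n) {
            synchronized (lock) {
                freeSlots += n;
                lock.notifyAll();
            }
        }

        /**
         * Marks the task killed and notifies under the same lock as the wait,
         * so the waiter cannot miss the notification.
         */
        void kill(QueuedTask task) {
            synchronized (lock) {
                task.killed = true;
                lock.notifyAll();
            }
        }

        int freeSlots() {
            synchronized (lock) {
                return freeSlots;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Launcher launcher = new Launcher(1);      // only one slot is free
        QueuedTask highRam = new QueuedTask(2);   // task needs two slots
        AtomicBoolean launched = new AtomicBoolean(true);

        Thread waiter = new Thread(() -> launched.set(launcher.awaitSlots(highRam)));
        waiter.start();
        launcher.kill(highRam);                   // external kill while queued
        waiter.join();

        System.out.println("launched=" + launched.get()
                + " freeSlots=" + launcher.freeSlots());
    }
}
```

In this sketch the kill flag is set while holding the lock that the waiter uses, which is one simple way to avoid a lost wake-up; once the killed high RAM task gives up, the single free slot remains available for a normal one-slot task.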