Message-ID: <830231235.292321266299668167.JavaMail.jira@brutus.apache.org>
Date: Tue, 16 Feb 2010 05:54:28 +0000 (UTC)
From: "Amareshwari Sriramadasu (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Subject: [jira] Updated: (MAPREDUCE-1398) TaskLauncher remains stuck on tasks waiting for free nodes even if task is killed.
In-Reply-To: <442856609.7031264180161572.JavaMail.jira@brutus.apache.org>

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1398:
-----------------------------------------------

    Status: Patch Available  (was: Open)

> TaskLauncher remains stuck on tasks waiting for free nodes even if task is killed.
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1398
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1398
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>            Reporter: Hemanth Yamijala
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-1398-1.txt, patch-1398-2.txt, patch-1398.txt
>
>
> Tasks could be assigned to trackers for slots that are running other tasks in a commit pending state. This is an optimization done to pipeline task assignment and launch. When the task reaches the tracker, it waits until sufficient slots become free for it. This wait is done in the TaskLauncher thread. Now, while waiting, if the task is killed externally (maybe because the job finishes, etc.), the TaskLauncher is not notified of this. So, it continues to wait for the killed task to get sufficient slots. If slots do not become free for a long time, this would result in considerable delay in waking up the TaskLauncher thread. If the waiting task happens to be a high RAM task, then it is also wasteful, because by waking up, it can make way for normal tasks that can run on the available number of slots.
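
The behaviour described in the issue can be illustrated with a small wait/notify sketch. This is not the code from the attached patches and not Hadoop's actual TaskLauncher; the class `SlotLauncherSketch`, the nested `PendingTask`, and the methods `waitForSlots`, `releaseSlots`, and `kill` are hypothetical names chosen for illustration. The sketch assumes the fix amounts to having the kill path wake the waiting thread, so that the wait loop can re-check a "killed" flag instead of waiting only for slots.

```java
// Hypothetical, simplified model of a launcher thread waiting for free slots.
// Names and structure are assumptions for illustration, not Hadoop code.
class SlotLauncherSketch {

    /** A task waiting to be launched, needing some number of slots. */
    static final class PendingTask {
        final int slotsNeeded;
        private boolean killed; // guarded by the enclosing launcher's lock

        PendingTask(int slotsNeeded) {
            this.slotsNeeded = slotsNeeded;
        }
    }

    private final Object lock = new Object();
    private int freeSlots; // guarded by lock

    SlotLauncherSketch(int freeSlots) {
        this.freeSlots = freeSlots;
    }

    /**
     * Blocks until enough slots are free or the task is killed.
     * Returns true if the slots were acquired, false if the task was killed.
     */
    boolean waitForSlots(PendingTask task) throws InterruptedException {
        synchronized (lock) {
            // Re-check both conditions after every wake-up.
            while (!task.killed && freeSlots < task.slotsNeeded) {
                lock.wait();
            }
            if (task.killed) {
                return false; // stop waiting so the next task can be considered
            }
            freeSlots -= task.slotsNeeded;
            return true;
        }
    }

    /** Called when a running task finishes and gives its slots back. */
    void releaseSlots(int count) {
        synchronized (lock) {
            freeSlots += count;
            lock.notifyAll();
        }
    }

    /** Called when a task is killed externally; wakes the waiting thread. */
    void kill(PendingTask task) {
        synchronized (lock) {
            task.killed = true;
            lock.notifyAll(); // without this, the waiter would stay blocked
        }
    }
}
```

In this sketch, a thread blocked in `waitForSlots` for a two-slot task returns `false` as soon as `kill` is called, even if only one slot is free, and is then able to pick up a task that fits the available slots.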