From: "Namit Jain (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Date: Thu, 18 Jun 2009 18:35:07 -0700 (PDT)
Subject: [jira] Commented: (HADOOP-6082) Elegant decommission of lighty loaded tasktrackers from a map-reduce cluster
Message-ID: <1950407720.1245375307337.JavaMail.jira@brutus>
In-Reply-To: <150681110.1245371887463.JavaMail.jira@brutus>

[
https://issues.apache.org/jira/browse/HADOOP-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721607#action_12721607 ]

Namit Jain commented on HADOOP-6082:
------------------------------------

The JobTracker needs a new API, something like:

    decommissionTaskTrackers(int numberOfTT);

The TaskTracker may need to expose a new API that returns its current load. The JobTracker will get the current load of each TaskTracker and then decommission the 'n' most lightly loaded TaskTrackers.

When a TaskTracker is being decommissioned, it will stop accepting new tasks and will die when all of its current tasks have finished. This may waste cluster resources, since the draining TaskTracker's free slots sit idle until its last task completes. The JobTracker can optionally pass a timeout after which the TaskTracker will definitely die. In that case, it might be a good idea to increase the number of retries for the tasks that are still executing.

The UI may need to be changed to show the new status of the TaskTracker as well.

> Elegant decommission of lighty loaded tasktrackers from a map-reduce cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-6082
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6082
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: dhruba borthakur
>            Assignee: Namit Jain
>
> There is a need to elegantly move some machines from one map-reduce cluster to another. This JIRA is to discuss how to find lightly loaded tasktrackers that are candidates for decommissioning and then to elegantly decommission them by waiting for existing tasks to finish.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
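
The selection and drain rules proposed in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Hadoop code: TrackerLoad, DecommissionPlanner, pickLightlyLoaded and shouldExit are hypothetical names, "load" is assumed to mean the number of running tasks, and a timeout of 0 is assumed to mean "no timeout".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical load report for one TaskTracker; not a real Hadoop class. */
final class TrackerLoad {
    final String trackerName;
    final int runningTasks; // assumption: load is measured as running task count

    TrackerLoad(String trackerName, int runningTasks) {
        this.trackerName = trackerName;
        this.runningTasks = runningTasks;
    }
}

/** Hypothetical helper showing the decisions described in the comment. */
final class DecommissionPlanner {

    /** Returns the names of the numberOfTT most lightly loaded trackers. */
    static List<String> pickLightlyLoaded(List<TrackerLoad> loads, int numberOfTT) {
        if (numberOfTT < 0) {
            throw new IllegalArgumentException("numberOfTT must be >= 0");
        }
        // Sort a copy by load, breaking ties by name so the result is deterministic.
        List<TrackerLoad> sorted = new ArrayList<>(loads);
        sorted.sort((a, b) -> {
            int byLoad = Integer.compare(a.runningTasks, b.runningTasks);
            return byLoad != 0 ? byLoad : a.trackerName.compareTo(b.trackerName);
        });
        List<String> picked = new ArrayList<>();
        int limit = Math.min(numberOfTT, sorted.size());
        for (TrackerLoad t : sorted.subList(0, limit)) {
            picked.add(t.trackerName);
        }
        return picked;
    }

    /**
     * A draining tracker exits once it has no running tasks, or once the
     * optional timeout has elapsed (timeoutMillis <= 0 means no timeout).
     */
    static boolean shouldExit(int runningTasks, long drainingForMillis, long timeoutMillis) {
        boolean timedOut = timeoutMillis > 0 && drainingForMillis >= timeoutMillis;
        return runningTasks == 0 || timedOut;
    }
}
```

In this sketch the JobTracker side would call pickLightlyLoaded with the loads it has collected, and each chosen tracker would check shouldExit periodically while refusing new tasks. How load is actually reported and how the timeout is delivered are left open by the comment.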