Date: Tue, 10 Feb 2015 18:52:16 +0000 (UTC)
From: "Junping Du (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-914) Support graceful decommission of nodemanager

[ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314653#comment-14314653 ]

Junping Du commented on YARN-914:
---------------------------------

Thanks [~vinodkv] for the comments!

bq. IAC, I think we should also have a CLI command to decommission the node which optionally waits till the decommission succeeds.

That sounds pretty good. This new CLI could gracefully decommission the specified nodes, wait up to a timeout, and then forcefully decommission any nodes that have not finished. Compared with the external-script approach Ming proposed above, this depends less on effort outside of Hadoop.

bq. Regarding storage of the decommission state, YARN-2567 also plans to make sure that the state of all nodes is maintained up to date on the state-store. That helps with many other cases too. We should combine these efforts.

That makes sense. However, YARN-2567 is about a threshold; maybe that is the wrong JIRA number?

bq. Regarding long running services, I think it makes sense to let the admin initiating the decommission know - not in terms of policy but as a diagnostic. Other than waiting for a timeout, the admin may not have noticed that a service is running on this node before the decommission is triggered.

bq. This is the umbrella concern I have. There are two ways to do this: Let YARN manage the decommission process or manage it on top of YARN. If the latter is the approach, I don't see a lot to be done here besides YARN-291. No?

Agreed that the second approach takes less effort. Even so, we still need the RM to be aware when containers/apps finish so it can trigger shutdown of the NM; that lets decommission complete earlier (and at different times per node), which I think matters for upgrades of large clusters. Isn't it? For YARN-291, my understanding is that we don't rely on any of the open issues left there, because we only need to set an NM's resource to 0 at runtime, which is already provided there. BTW, I think the approach you just proposed above is "the second approach + a new CLI", isn't it? I prefer to go this way, but I would also like to hear other people's ideas here.
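To make the flow above concrete, here is a minimal sketch of the "graceful + timeout" behavior being discussed: stop scheduling work on the node (the YARN-291 set-resource-to-0 idea), wait for running containers to drain, then force the decommission at the deadline. All class and method names here are hypothetical, invented for illustration only; none of them are existing YARN APIs.

{code:java}
// Hypothetical sketch of "graceful decommission with a forceful timeout".
// NodeTracker and its methods are invented for illustration; they are not
// real YARN interfaces.
interface NodeTracker {
  void setSchedulableResource(int memoryMb, int vcores); // YARN-291-style runtime update
  int getNumRunningContainers();
  void forceDecommission();
}

public class GracefulDecommissioner {
  private static final long POLL_INTERVAL_MS = 1000;

  /** Drain the node, then force decommission once timeoutMs has elapsed. */
  public void decommission(NodeTracker node, long timeoutMs)
      throws InterruptedException {
    // Step 1: stop new containers from landing on the node.
    node.setSchedulableResource(0, 0);

    // Step 2: wait for running containers/apps to finish, up to the timeout.
    // Because the RM notices when the node drains, each node can leave as
    // soon as it is empty rather than at a fixed cluster-wide deadline.
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (node.getNumRunningContainers() > 0
        && System.currentTimeMillis() < deadline) {
      Thread.sleep(POLL_INTERVAL_MS);
    }

    // Step 3: drained or timed out, take the node out of the cluster.
    node.forceDecommission();
  }
}
{code}

The new CLI could then be a thin front end over this logic, taking a node list and an optional timeout and blocking until all nodes are decommissioned (the exact command name and flags would be part of the design, so none are assumed here).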
> Support graceful decommission of nodemanager
> ---------------------------------------------
>
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>         Attachments: Gracefully Decommission of NodeManager (v1).pdf
>
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.), it's desirable to minimize the impact on running applications.
> Currently, if an NM is decommissioned, all running containers on the NM need to be rescheduled on other NMs. Furthermore, for finished map tasks, if their map outputs have not been fetched by the reducers of the job, these map tasks will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a node manager.
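For illustration of what "optionally graceful" could mean at the node level, here is one hypothetical way to model it, as an extra state between running and decommissioned. This sketch is not taken from the attached design document; the state names are placeholders.

{code:java}
// Hypothetical node lifecycle for graceful decommission; illustration only.
enum NodeState {
  RUNNING,          // node schedules and runs containers normally
  DECOMMISSIONING,  // no new containers are scheduled; existing containers
                    // drain, and already-produced map outputs remain
                    // fetchable so finished map tasks need not rerun
  DECOMMISSIONED    // node is removed from the active cluster
}
{code}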