Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: yarn-issues@hadoop.apache.org
Date: Tue, 22 Jul 2014 21:24:39 +0000 (UTC)
From: "Jason Lowe (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.12728868.1406039560804.23855.1406064279701@arcas>
In-Reply-To: <JIRA.12728868.1406039560804@arcas>
References: <JIRA.12728868.1406039560804@arcas>
Subject: [jira] [Commented] (YARN-2331) Distinguish shutdown during
 supervision vs. shutdown for rolling upgrade
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/YARN-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070925#comment-14070925 ] 

Jason Lowe commented on YARN-2331:
----------------------------------

Another possible approach is to have an unsupervised NM always try to clean up containers on shutdown.  If a rolling upgrade needs to be performed, and containers therefore need to be preserved, the NM would be killed without the chance to clean up (e.g., kill -9 to deliver a SIGKILL).  Upon restart, the NM would recover its state from the state store and reacquire the containers.
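To make the idea concrete, here is a minimal sketch of that behavior as a toy model. It is not NM code: NmShutdownSketch, StateStore, NodeManager, killWithSigkill and the other names are hypothetical, and the in-memory set only stands in for the real state store. The point it illustrates is that a graceful shutdown of an unsupervised NM cleans up, while a SIGKILL runs no handler and so leaves the recorded containers in place for the next start.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model only; none of these names are real YARN classes. */
final class NmShutdownSketch {

    /** Stand-in for the NM state store, which outlives the NM process. */
    static final class StateStore {
        final Set<String> containers = new LinkedHashSet<>();
    }

    static final class NodeManager {
        private final StateStore store;
        private final boolean supervised;

        NodeManager(StateStore store, boolean supervised) {
            this.store = store;
            this.supervised = supervised;
        }

        void launch(String containerId) {
            store.containers.add(containerId);
        }

        /** Graceful shutdown: an unsupervised NM cleans up its containers. */
        void shutdown() {
            if (!supervised) {
                store.containers.clear();
            }
        }

        /** SIGKILL cannot be caught, so no cleanup code runs at all. */
        void killWithSigkill() {
            // Intentionally empty: the process simply disappears.
        }

        /** On restart, read back whatever the state store still holds. */
        Set<String> recover() {
            return new LinkedHashSet<>(store.containers);
        }
    }

    public static void main(String[] args) {
        StateStore store = new StateStore();
        NodeManager nm = new NodeManager(store, false);
        nm.launch("container_1");
        nm.killWithSigkill();  // rolling upgrade path
        System.out.println(new NodeManager(store, false).recover());
    }
}
```

In this model the rolling upgrade case needs no extra flag: the operator chooses the outcome by choosing how the NM is stopped.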

> Distinguish shutdown during supervision vs. shutdown for rolling upgrade
> ------------------------------------------------------------------------
>
>                 Key: YARN-2331
>                 URL: https://issues.apache.org/jira/browse/YARN-2331
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>
> When the NM is shutting down with restart support enabled, there are three scenarios we'd like to distinguish and behave accordingly:
> # The NM is running under supervision.  In that case containers should be preserved so the automatic restart can recover them.
> # The NM is not running under supervision and a rolling upgrade is not being performed.  In that case the shutdown should kill all containers since it is unlikely the NM will be restarted in a timely manner to recover them.
> # The NM is not running under supervision and a rolling upgrade is being performed.  In that case the shutdown should not kill all containers since a restart is imminent due to the rolling upgrade and the containers will be recovered.
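The three scenarios in the description reduce to a small decision, sketched below. ShutdownDecisionSketch, Action and onShutdown are hypothetical names, and how the NM would actually learn that it is supervised or that a rolling upgrade is in progress is left open here. The description does not cover a supervised NM during a rolling upgrade; the sketch assumes containers are preserved in that case too.

```java
/** Hypothetical decision helper; not part of the YARN code base. */
final class ShutdownDecisionSketch {

    enum Action { PRESERVE_CONTAINERS, KILL_CONTAINERS }

    /**
     * Scenario 1: supervised                      -> preserve.
     * Scenario 2: unsupervised, no rolling upgrade -> kill.
     * Scenario 3: unsupervised, rolling upgrade    -> preserve.
     * Supervised plus rolling upgrade is an assumption: preserve.
     */
    static Action onShutdown(boolean supervised, boolean rollingUpgrade) {
        if (supervised || rollingUpgrade) {
            return Action.PRESERVE_CONTAINERS;
        }
        return Action.KILL_CONTAINERS;
    }

    public static void main(String[] args) {
        System.out.println(onShutdown(true, false));
        System.out.println(onShutdown(false, false));
        System.out.println(onShutdown(false, true));
    }
}
```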

