hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2331) Distinguish shutdown during supervision vs. shutdown for rolling upgrade
Date Tue, 22 Jul 2014 21:24:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070925#comment-14070925 ]

Jason Lowe commented on YARN-2331:
----------------------------------

Another possible approach is to have the NM always try to clean up containers on shutdown
when it is unsupervised.  If a rolling upgrade needs to be performed, and thus containers need
to be preserved, the NM would be killed without the chance to clean up (e.g., kill -9 to deliver
a SIGKILL).  Upon restart, the NM would recover its state from the state store and reacquire
the containers.
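
A minimal sketch of this approach in Java, for illustration only. The class and method
names are hypothetical and are not part of the NodeManager code; the only real API used is
Runtime.addShutdownHook. The point it shows: a JVM shutdown hook runs on a normal shutdown
(e.g. SIGTERM) but never on SIGKILL, so the cleanup path is skipped entirely when the NM is
killed with kill -9 for a rolling upgrade.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GracefulShutdownSketch {

    /**
     * Hypothetical helper: returns the IDs of the containers that a graceful
     * shutdown would kill. Under this approach only supervision matters,
     * because a rolling upgrade never reaches this code path (SIGKILL
     * terminates the JVM without running shutdown hooks).
     */
    public static List<String> containersToKill(boolean supervised, List<String> running) {
        if (supervised) {
            // Supervised: leave containers alive so the automatic restart can recover them.
            return new ArrayList<>();
        }
        // Unsupervised graceful shutdown: clean up everything that is running.
        return new ArrayList<>(running);
    }

    public static void main(String[] args) {
        // Placeholder container IDs and supervision flag, assumed for the example.
        final List<String> running = Arrays.asList("container_01", "container_02");
        final boolean supervised = false;

        // The hook runs when the JVM exits normally or receives SIGTERM.
        // It does not run when the process receives SIGKILL.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
            System.out.println("Killing containers: " + containersToKill(supervised, running))));
    }
}
```

With this design the rolling-upgrade case needs no extra flag or configuration: the operator
selects it by choosing the signal used to stop the NM.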

> Distinguish shutdown during supervision vs. shutdown for rolling upgrade
> ------------------------------------------------------------------------
>
>                 Key: YARN-2331
>                 URL: https://issues.apache.org/jira/browse/YARN-2331
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>
> When the NM is shutting down with restart support enabled there are scenarios we'd like
> to distinguish and behave accordingly:
> # The NM is running under supervision.  In that case containers should be preserved so
> the automatic restart can recover them.
> # The NM is not running under supervision and a rolling upgrade is not being performed.
> In that case the shutdown should kill all containers since it is unlikely the NM will be
> restarted in a timely manner to recover them.
> # The NM is not running under supervision and a rolling upgrade is being performed.
> In that case the shutdown should not kill all containers since a restart is imminent due to
> the rolling upgrade and the containers will be recovered.
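
The three scenarios in the issue description reduce to a two-input decision, sketched below
in Java. The class name, method name, and both boolean parameters are hypothetical; how the
NM would actually learn that it is supervised or that a rolling upgrade is in progress is
exactly what this issue leaves open.

```java
public class ShutdownPolicy {

    /**
     * Hypothetical helper: returns true when the NM should kill all
     * containers as part of its shutdown.
     */
    public static boolean shouldKillContainersOnShutdown(boolean supervised,
                                                         boolean rollingUpgrade) {
        if (supervised) {
            // Scenario 1: the automatic restart will recover the containers.
            return false;
        }
        if (rollingUpgrade) {
            // Scenario 3: a restart is imminent, so the containers are preserved.
            return false;
        }
        // Scenario 2: no timely restart is expected, so the containers are killed.
        return true;
    }

    public static void main(String[] args) {
        System.out.println("supervised:                 "
            + shouldKillContainersOnShutdown(true, false));
        System.out.println("unsupervised, no upgrade:   "
            + shouldKillContainersOnShutdown(false, false));
        System.out.println("unsupervised, with upgrade: "
            + shouldKillContainersOnShutdown(false, true));
    }
}
```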



--
This message was sent by Atlassian JIRA
(v6.2#6252)
