hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-5620) Core changes in NodeManager to support for upgrade and rollback of Containers
Date Mon, 12 Sep 2016 09:19:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483565#comment-15483565
] 

Jian He edited comment on YARN-5620 at 9/12/16 9:18 AM:
--------------------------------------------------------

bq. It is also possible that an admin logs into the NM and does a 'kill -9', which will
also cause the ContainerLaunch to send CONTAINER_KILLED_ON_REQUEST, but it won't be in KILLING
state.. right ?
I guess in this case it's also fine to do the upgrade, because the upgrade API did accept
the request and it's hard to tell which of the two should go first. The reverse can also happen,
because the state is transient: if setKillForReInitialization is called first and the container
process is then killed, it will be treated as a re-init even though it was killed by an
external signal. So should we keep the behavior consistent?
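To make the ordering issue concrete, here is a minimal sketch. It is not the NodeManager code: {{ContainerModel}}, {{Outcome}} and {{onKilledOnRequest}} are hypothetical stand-ins, and only the name setKillForReInitialization comes from the patch under discussion. It assumes the exit handler decides purely from a flag, so whichever of "flag set" and "process killed" happens first determines the outcome.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the race; none of these types exist in YARN.
class ReInitRaceSketch {
  enum Outcome { RE_INIT, KILLED_EXTERNALLY }

  static class ContainerModel {
    // Stand-in for the "kill is part of a re-initialization" marker.
    private final AtomicBoolean killForReInit = new AtomicBoolean(false);

    void setKillForReInitialization() {
      killForReInit.set(true);
    }

    // Called when the launcher reports that the process was killed on request.
    // The handler cannot see who sent the signal, only whether the flag is set.
    Outcome onKilledOnRequest() {
      return killForReInit.getAndSet(false)
          ? Outcome.RE_INIT
          : Outcome.KILLED_EXTERNALLY;
    }
  }

  public static void main(String[] args) {
    // Flag set first, then an external 'kill -9': still classified as re-init.
    ContainerModel a = new ContainerModel();
    a.setKillForReInitialization();
    System.out.println(a.onKilledOnRequest());

    // External kill arrives before the flag is set: classified as a plain kill.
    ContainerModel b = new ContainerModel();
    System.out.println(b.onKilledOnRequest());
  }
}
```

Under these assumptions both orderings are indistinguishable from the handler's side, which is the argument for treating them the same way.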
bq. Actually if you look at the prepareContainerUpgrade() function,
Ah, yes, I misread that. Thank you!
bq. The problem with getAllResourcesByVisibility, is it gets all resources. I just need the
pending resources
In this case, the pendingResources are the same as what getAllResourcesByVisibility returns, right?
Basically, I meant something like the code below, in which case the newly added methods may not be needed.
{code}
Map<LocalResourceVisibility, Collection<LocalResourceRequest>>
    pendingResources = ((ContainerReInitEvent) event).getResourceSet()
    .getAllResourcesByVisibility();
if (!pendingResources.isEmpty()) {
  container.dispatcher.getEventHandler().handle(
      new ContainerLocalizationRequestEvent(container, pendingResources));
} else {
{code}
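As a rough illustration of why the two views could coincide, here is a self-contained sketch. {{ResourceSetModel}}, {{Visibility}}, {{markLocalized}} and {{getPendingByVisibility}} are hypothetical stand-ins, not the real ResourceSet API; only getAllResourcesByVisibility is a name taken from the snippet above. It assumes the resource set carried by the re-init event is freshly built, so nothing in it has been localized yet.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simplification; resources are plain strings here.
class PendingResourcesSketch {
  enum Visibility { PUBLIC, PRIVATE, APPLICATION }

  static class ResourceSetModel {
    private final Map<Visibility, Collection<String>> all =
        new EnumMap<>(Visibility.class);
    private final Set<String> localized = new HashSet<>();

    void addResource(Visibility v, String name) {
      all.computeIfAbsent(v, k -> new ArrayList<>()).add(name);
    }

    void markLocalized(String name) {
      localized.add(name);
    }

    Map<Visibility, Collection<String>> getAllResourcesByVisibility() {
      return all;
    }

    // Everything requested minus whatever has already been localized.
    Map<Visibility, Collection<String>> getPendingByVisibility() {
      Map<Visibility, Collection<String>> pending =
          new EnumMap<>(Visibility.class);
      for (Map.Entry<Visibility, Collection<String>> e : all.entrySet()) {
        List<String> rest = new ArrayList<>();
        for (String r : e.getValue()) {
          if (!localized.contains(r)) {
            rest.add(r);
          }
        }
        if (!rest.isEmpty()) {
          pending.put(e.getKey(), rest);
        }
      }
      return pending;
    }
  }

  public static void main(String[] args) {
    ResourceSetModel rs = new ResourceSetModel();
    rs.addResource(Visibility.PUBLIC, "app-v2.jar");
    rs.addResource(Visibility.PRIVATE, "conf-v2.xml");
    // Fresh set: pending and "all" describe the same resources.
    System.out.println(
        rs.getPendingByVisibility().equals(rs.getAllResourcesByVisibility()));
  }
}
```

If the resource set on the event can already contain localized entries, the two views diverge and a separate pending accessor would be needed after all.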
- I forgot to mention: similarly, is the change in ResourceLocalizedWhileRunningTransition required,
given that the symlinks are already distinct?



> Core changes in NodeManager to support for upgrade and rollback of Containers
> -----------------------------------------------------------------------------
>
>                 Key: YARN-5620
>                 URL: https://issues.apache.org/jira/browse/YARN-5620
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-5620.001.patch, YARN-5620.002.patch, YARN-5620.003.patch, YARN-5620.004.patch,
YARN-5620.005.patch, YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, YARN-5620.009.patch,
YARN-5620.010.patch
>
>
> This JIRA proposes to modify the ContainerManager (and other core classes) to support upgrading
a running container with a new {{ContainerLaunchContext}}, as well as the ability to roll back
the upgrade if the container is not able to restart using the new launch context.


