hadoop-yarn-issues mailing list archives

From "Siqi Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4831) Recovered containers will be killed after NM stateful restart
Date Wed, 16 Mar 2016 18:37:33 GMT

    [ https://issues.apache.org/jira/browse/YARN-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197895#comment-15197895 ]

Siqi Li commented on YARN-4831:

When the NM does a stateful restart, ContainerManagerImpl tries to recover applications
and containers, and then sends an ApplicationFinishEvent to each app in appsState.getFinishedApplications().

The ApplicationFinishEvent can cause newly recovered containers to transition from NEW
to DONE through KillOnNewTransition.
We could add an additional check in KillOnNewTransition to avoid killing completed containers.
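
A rough sketch of what such a check might look like, using a simplified stand-alone model rather than the real NodeManager classes. The names RecoveredContainerSketch, RecoveredStatus, killWhileNew and reportedAsKilled are made up for illustration and are not taken from the Hadoop source; an actual patch would have to work inside ContainerImpl's state machine and may look quite different.

```java
/** Simplified model of a recovered NM container; NOT the real ContainerImpl. */
class RecoveredContainerSketch {

    /** Only the two container states that appear in the log excerpt. */
    enum State { NEW, DONE }

    /** Hypothetical marker for what was known about the container before the restart. */
    enum RecoveredStatus { REQUESTED, LAUNCHED, COMPLETED }

    static final class Container {
        private final RecoveredStatus recoveredStatus;
        private State state = State.NEW;
        private boolean reportedAsKilled = false;

        Container(RecoveredStatus recoveredStatus) {
            this.recoveredStatus = recoveredStatus;
        }

        /** Stand-in for the kill-on-NEW transition triggered by an application finish event. */
        void killWhileNew() {
            if (state != State.NEW) {
                return; // this transition only applies to containers still in NEW
            }
            // Assumed extra check: a container that had already completed before
            // the restart still moves to DONE, but is not recorded as killed.
            if (recoveredStatus != RecoveredStatus.COMPLETED) {
                reportedAsKilled = true;
            }
            state = State.DONE;
        }

        State getState() { return state; }

        boolean isReportedAsKilled() { return reportedAsKilled; }
    }

    public static void main(String[] args) {
        Container completed = new Container(RecoveredStatus.COMPLETED);
        completed.killWhileNew();
        System.out.println(completed.getState() + " killed=" + completed.isReportedAsKilled());

        Container requested = new Container(RecoveredStatus.REQUESTED);
        requested.killWhileNew();
        System.out.println(requested.getState() + " killed=" + requested.isReportedAsKilled());
    }
}
```

In this model both containers end up in DONE; only the one that had not completed before the restart is flagged as killed.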

> Recovered containers will be killed after NM stateful restart 
> --------------------------------------------------------------
>                 Key: YARN-4831
>                 URL: https://issues.apache.org/jira/browse/YARN-4831
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Siqi Li
> {code}
> 2016-03-04 19:43:48,130 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1456335621285_0040_01_000066 transitioned from NEW to DONE
> 2016-03-04 19:43:48,130 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=henkins-service	OPERATION=Container Finished - Killed	TARGET=ContainerImpl	RESULT=SUCCESS
> {code}

