[ https://issues.apache.org/jira/browse/YARN-8650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
lujie updated YARN-8650:
------------------------
Attachment: hadoop-hires-nodemanager-hadoop11.log
> Invalid event: CONTAINER_KILLED_ON_REQUEST at DONE and Invalid event: CONTAINER_LAUNCHED at DONE
> -------------------------------------------------------------------------------------------------
>
> Key: YARN-8650
> URL: https://issues.apache.org/jira/browse/YARN-8650
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: lujie
> Priority: Major
> Attachments: hadoop-hires-nodemanager-hadoop11.log, hadoop-hires-nodemanager-hadoop15.log
>
>
> We tested Hadoop while the NodeManager was shutting down and encountered two InvalidStateTransitionExceptions:
> {code:java}
> 2018-08-04 14:29:33,025 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Can't handle this event at current state: Current: [DONE], eventType: [CONTAINER_KILLED_ON_REQUEST], container: [container_1533364185282_0001_01_000001]
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: CONTAINER_KILLED_ON_REQUEST at DONE
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:2084)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:103)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1483)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1476)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> {code:java}
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: CONTAINER_LAUNCHED at DONE
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:2084)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:103)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1483)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1476)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> We analyzed these two bugs and found that NodeManager shutdown sends a kill event,
> which in turn causes these two exceptions. We have tested this on our cluster many times
> and can reproduce it deterministically.
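
The failure mode reported above can be illustrated with a small standalone sketch. This is not the real ContainerImpl state machine or the StateMachineFactory API: the class ContainerStateSketch, its two states, the KILL_CONTAINER event name and the build() helper are all hypothetical and exist only for illustration. The sketch assumes that a kill sent during shutdown moves the container to DONE, and that CONTAINER_LAUNCHED or CONTAINER_KILLED_ON_REQUEST then arrives late, when DONE has no transition registered for it.

```java
import java.util.EnumMap;
import java.util.Map;

// Simplified, hypothetical model; NOT the real ContainerImpl state machine.
final class ContainerStateSketch {
    enum State { NEW, DONE }
    enum Event { KILL_CONTAINER, CONTAINER_LAUNCHED, CONTAINER_KILLED_ON_REQUEST }

    // Stand-in for InvalidStateTransitionException, using the same message layout as the log.
    static final class InvalidTransition extends RuntimeException {
        InvalidTransition(Event event, State state) {
            super("Invalid event: " + event + " at " + state);
        }
    }

    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.NEW;

    void addTransition(State from, Event on, State to) {
        table.computeIfAbsent(from, k -> new EnumMap<Event, State>(Event.class)).put(on, to);
    }

    // Looks up (current state, event); throws if no transition is registered.
    State handle(Event event) {
        Map<Event, State> row = table.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new InvalidTransition(event, current);
        }
        current = next;
        return current;
    }

    // Assumption: shutdown kills the container, taking it straight to DONE.
    // With ignoreLateEventsAtDone, DONE also accepts the two late events as no-ops.
    static ContainerStateSketch build(boolean ignoreLateEventsAtDone) {
        ContainerStateSketch m = new ContainerStateSketch();
        m.addTransition(State.NEW, Event.KILL_CONTAINER, State.DONE);
        if (ignoreLateEventsAtDone) {
            m.addTransition(State.DONE, Event.CONTAINER_LAUNCHED, State.DONE);
            m.addTransition(State.DONE, Event.CONTAINER_KILLED_ON_REQUEST, State.DONE);
        }
        return m;
    }

    public static void main(String[] args) {
        ContainerStateSketch strict = build(false);
        strict.handle(Event.KILL_CONTAINER);
        try {
            strict.handle(Event.CONTAINER_LAUNCHED); // late event after the kill
        } catch (InvalidTransition e) {
            System.out.println(e.getMessage());
        }

        ContainerStateSketch tolerant = build(true);
        tolerant.handle(Event.KILL_CONTAINER);
        System.out.println(tolerant.handle(Event.CONTAINER_KILLED_ON_REQUEST));
    }
}
```

In the strict variant the late event is rejected with the same message shape as in the log; in the tolerant variant DONE absorbs it through a self-transition. Whether ignoring these events at DONE is the appropriate fix for the real NodeManager is not stated in this report, so the second variant is only one possible direction.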