hadoop-yarn-issues mailing list archives

From "Zhizhen Hou (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-7855) RMAppAttemptImpl throws Invalid event: CONTAINER_ALLOCATED at ALLOCATED_SAVING Exception
Date Thu, 01 Feb 2018 07:15:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348122#comment-16348122 ]

Zhizhen Hou edited comment on YARN-7855 at 2/1/18 7:14 AM:
-----------------------------------------------------------

I have reproduced this error. I ran a MapReduce job and, while it was running, found the
MRAppMaster process and killed it. The NodeManager reports this to the ResourceManager, which
passes it on to the RMAppImpl object, and RMAppImpl creates a new RMAppAttempt as the current
RMAppAttempt. During this window, however, containers requested by the former MRAppMaster can
be allocated to the current RMAppAttempt. The current RMAppAttempt cannot handle this event:
its state machine has no transition for CONTAINER_ALLOCATED from its current state
(ALLOCATED_SAVING in the log quoted below).
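
To illustrate the failure mode described above, here is a minimal, self-contained sketch of a
table-driven state machine. It is not the Hadoop implementation: the class AttemptStateSketch,
its reduced set of states and events, and the ignoreLateAllocations flag are all hypothetical
and exist only for illustration. The flag shows one possible way to tolerate a late
CONTAINER_ALLOCATED event (a self-loop on ALLOCATED_SAVING); whether that is the right fix for
YARN-7855 is an open question here.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified model of an attempt state machine (not Hadoop code).
class AttemptStateSketch {
    enum State { SCHEDULED, ALLOCATED_SAVING, ALLOCATED }
    enum Event { CONTAINER_ALLOCATED, ATTEMPT_NEW_SAVED }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.SCHEDULED;

    AttemptStateSketch(boolean ignoreLateAllocations) {
        add(State.SCHEDULED, Event.CONTAINER_ALLOCATED, State.ALLOCATED_SAVING);
        add(State.ALLOCATED_SAVING, Event.ATTEMPT_NEW_SAVED, State.ALLOCATED);
        if (ignoreLateAllocations) {
            // Assumption for illustration: treat a second allocation as a no-op self-loop.
            add(State.ALLOCATED_SAVING, Event.CONTAINER_ALLOCATED, State.ALLOCATED_SAVING);
        }
    }

    private void add(State from, Event on, State to) {
        table.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class)).put(on, to);
    }

    // Applies the event, or throws if the table has no entry for (current state, event).
    State handle(Event event) {
        Map<Event, State> row = table.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = next;
        return current;
    }

    public static void main(String[] args) {
        AttemptStateSketch strict = new AttemptStateSketch(false);
        strict.handle(Event.CONTAINER_ALLOCATED);      // SCHEDULED -> ALLOCATED_SAVING
        try {
            strict.handle(Event.CONTAINER_ALLOCATED);  // late allocation, no transition defined
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        AttemptStateSketch tolerant = new AttemptStateSketch(true);
        tolerant.handle(Event.CONTAINER_ALLOCATED);
        System.out.println(tolerant.handle(Event.CONTAINER_ALLOCATED)); // stays in ALLOCATED_SAVING
    }
}
```

In the strict variant the second CONTAINER_ALLOCATED finds no table entry and is rejected,
which mirrors the "Invalid event: CONTAINER_ALLOCATED at ALLOCATED_SAVING" message in the log.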


was (Author: houzhizhen):
I have reproduce this error. I run a mapreduce job. At runtime, I find the MRAppMaster process
and kill it. The NodeManager will report this to ResourceManager. The ResourceManager will
report it to RMAppImpl object, and it will recreate a RMAppAttempt as current RMAppAttempt.
But during this period, the containers request by former MRAppMaster will be allocated to
current RMAppAttempt. The current RMAppAttempt can not deal this message . The state machine
does not include transition from current to CONTAINER_ALLOCATED.

> RMAppAttemptImpl throws Invalid event: CONTAINER_ALLOCATED at ALLOCATED_SAVING Exception
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-7855
>                 URL: https://issues.apache.org/jira/browse/YARN-7855
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.5
>            Reporter: Zhizhen Hou
>            Priority: Major
>
> After upgrading Hadoop from 2.6 to 2.7.5, the ResourceManager occasionally logs the following error.
>  
> {code:java}
> 2018-01-30 14:12:41,349 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: CONTAINER_ALLOCATED at ALLOCATED_SAVING
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:808)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:803)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:784)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>         at java.lang.Thread.run(Thread.java:745)
> 2018-01-30 14:12:41,351 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: CONTAINER_ALLOCATED at ALLOCATED_SAVING
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:808)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:803)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:784)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>         at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
