hadoop-yarn-issues mailing list archives

From "lujie (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-7663) RMAppImpl:Invalid event: START at KILLED
Date Thu, 04 Jan 2018 07:40:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310909#comment-16310909
] 

lujie edited comment on YARN-7663 at 1/4/18 7:39 AM:
-----------------------------------------------------

After reading Jason Lowe's useful suggestion, I changed the unit test and attached a new patch.
In this patch I do three things:
1. Add an empty protected method, onInvalidStateTransition, and call it from the code block
in which RMAppImpl handles InvalidStateTransition.
2. Create a new final class, RMAppImplForTest, which overrides onInvalidStateTransition. In
createNewTestApp, create an RMAppImplForTest object instead of an RMAppImpl.
3. Fix this bug by ignoring the event.
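
The three steps above can be sketched roughly as follows. This is a toy model, not the attached
patch: ToyApp, ToyAppForTest, the State and Event enums, and the ignoreStartAtKilled flag are
hypothetical stand-ins, and only the hook name onInvalidStateTransition is taken from the comment.

```java
public class InvalidTransitionHookSketch {
    enum State { NEW, SUBMITTED, KILLED }
    enum Event { START, KILL }

    /** Toy stand-in for RMAppImpl (hypothetical, not the Hadoop class). */
    static class ToyApp {
        private final boolean ignoreStartAtKilled;
        private State state = State.NEW;

        ToyApp(boolean ignoreStartAtKilled) {
            this.ignoreStartAtKilled = ignoreStartAtKilled;
        }

        State getState() {
            return state;
        }

        void handle(Event event) {
            try {
                state = doTransition(state, event);
            } catch (IllegalStateException e) {
                // Production code would only log here; the hook is the new call site.
                onInvalidStateTransition(event, state);
            }
        }

        /** Step 1: empty by default, so non-test behaviour is unchanged. */
        protected void onInvalidStateTransition(Event event, State current) {
        }

        private State doTransition(State current, Event event) {
            if (current == State.NEW && event == Event.START) {
                return State.SUBMITTED;
            }
            if (event == Event.KILL) {
                return State.KILLED;
            }
            // Step 3: treat a late START on an already killed app as a no-op.
            if (ignoreStartAtKilled && current == State.KILLED && event == Event.START) {
                return State.KILLED;
            }
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
    }

    /** Step 2: test-only subclass that turns an invalid transition into a failure. */
    static final class ToyAppForTest extends ToyApp {
        ToyAppForTest(boolean ignoreStartAtKilled) {
            super(ignoreStartAtKilled);
        }

        @Override
        protected void onInvalidStateTransition(Event event, State current) {
            throw new AssertionError("Invalid event: " + event + " at " + current);
        }
    }

    public static void main(String[] args) {
        ToyApp fixed = new ToyAppForTest(true);
        fixed.handle(Event.KILL);
        fixed.handle(Event.START); // ignored, so the test subclass does not fail
        System.out.println(fixed.getState());

        ToyApp unfixed = new ToyAppForTest(false);
        unfixed.handle(Event.KILL);
        try {
            unfixed.handle(Event.START);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the hook is that an invalid transition, which is otherwise only logged, becomes
visible to a unit test without changing what the non-test class does.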

However, two other InvalidStateTransition errors show up while testing: 1. testAppAcceptedFailed: APP_ACCEPTED
at state 2. testAppRunningFailed: ACCEPTED, APP_UPDATE_SAVED at state KILLED.
These two InvalidStateTransition errors may be bugs in RMAppImpl, or they may be bugs in TestRMAppTransitions.
Would it be better to defer them to another JIRA?

Or can we just ignore the event without adding a test, just as in [link YARN-4598|https://issues.apache.org/jira/browse/YARN-4598]?




> RMAppImpl:Invalid event: START at KILLED
> ----------------------------------------
>
>                 Key: YARN-7663
>                 URL: https://issues.apache.org/jira/browse/YARN-7663
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: lujie
>            Assignee: lujie
>            Priority: Minor
>              Labels: patch
>         Attachments: YARN-7663_1.patch, YARN-7663_2.patch, YARN-7663_3.patch
>
>
> After sending a kill to an application, the RM log shows:
> {code:java}
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: START at KILLED
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:805)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:901)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:885)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> If a sleep is inserted just before the point where the START event is created, this bug
> reproduces deterministically.
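
The ordering behind this reproduction can be illustrated with a small toy model. It is only a
sketch under stated assumptions: StartAfterKillSketch, its State and Event enums, and the queue
based dispatcher are hypothetical and do not use any Hadoop classes, and a CountDownLatch stands
in for the inserted sleep so that the example does not depend on timing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

public class StartAfterKillSketch {
    enum State { NEW, SUBMITTED, KILLED }
    enum Event { START, KILL }

    /** Runs the toy scenario and returns the messages a dispatcher would log. */
    static List<String> run() {
        BlockingQueue<Event> queue = new LinkedBlockingQueue<>();
        CountDownLatch killQueued = new CountDownLatch(1);
        List<String> log = new ArrayList<>();
        try {
            // Submission path: the latch stands in for the inserted sleep and holds
            // back the creation of START until the KILL is already queued.
            Thread submitter = new Thread(() -> {
                try {
                    killQueued.await();
                    queue.put(Event.START);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            submitter.start();

            // Client path: the kill request arrives while submission is delayed.
            queue.put(Event.KILL);
            killQueued.countDown();
            submitter.join();

            // Dispatcher: a single consumer that handles events in queue order.
            State state = State.NEW;
            for (int i = 0; i < 2; i++) {
                Event event = queue.take();
                if (state == State.NEW && event == Event.START) {
                    state = State.SUBMITTED;
                } else if (event == Event.KILL) {
                    state = State.KILLED;
                } else {
                    log.add("Invalid event: " + event + " at " + state);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because the KILL is always queued before the START in this model, the dispatcher always sees
START while the toy application is already KILLED, which mirrors the message in the log above.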



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

