hadoop-yarn-issues mailing list archives

From "lujie (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-7663) RMAppImpl:Invalid event: START at KILLED
Date Wed, 10 Jan 2018 03:04:03 GMT

    [ https://issues.apache.org/jira/browse/YARN-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310909#comment-16310909 ]

lujie edited comment on YARN-7663 at 1/10/18 3:03 AM:
------------------------------------------------------

After reading Jason Lowe's useful suggestion, I rewrote the unit test and attached the new patch.
In this patch I do three things:
1. Add an empty protected method, onInvalidStateTransition, and call it from the code block where RMAppImpl handles InvalidStateTransitionException.
2. Create a new final class, RMAppImplForTest, which overrides onInvalidStateTransition. createNewTestApp now creates an RMAppImplForTest object instead of an RMAppImpl.
3. Fix this bug by ignoring the event.
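
To illustrate the hook approach described above, here is a minimal, self-contained sketch. It does not reproduce the patch: SketchApp, SketchAppForTest, SketchState and SketchEvent are hypothetical stand-ins for RMAppImpl, RMAppImplForTest and the real state and event types, and the switch-based transition function only imitates the real state machine. Throwing an AssertionError from the override is also an assumption, since the comment does not say what the test override does.

```java
// Hypothetical, heavily simplified stand-ins; these are NOT the real YARN classes.
enum SketchState { NEW, NEW_SAVING, KILLED }
enum SketchEvent { START, KILL }

class SketchApp {
    protected SketchState state = SketchState.NEW;

    // Simplified handler. In this sketch a null result from transition()
    // plays the role of an invalid state transition.
    public void handle(SketchEvent event) {
        SketchState next = transition(state, event);
        if (next == null) {
            // Production behavior in this sketch: log and carry on.
            System.err.println("Invalid event: " + event + " at " + state);
            // Hook call site: a no-op here, overridable by tests.
            onInvalidStateTransition(event, state);
            return;
        }
        state = next;
    }

    // Returns the next state, or null when no transition is registered.
    private static SketchState transition(SketchState s, SketchEvent e) {
        switch (s) {
            case NEW:
                return e == SketchEvent.START ? SketchState.NEW_SAVING : SketchState.KILLED;
            case NEW_SAVING:
                return e == SketchEvent.KILL ? SketchState.KILLED : null;
            default:
                return null; // KILLED: nothing registered in this sketch
        }
    }

    // Intentionally empty in production code.
    protected void onInvalidStateTransition(SketchEvent event, SketchState state) { }

    public SketchState getState() { return state; }
}

// Test-only subclass: turns a silently logged invalid transition into a failure.
final class SketchAppForTest extends SketchApp {
    @Override
    protected void onInvalidStateTransition(SketchEvent event, SketchState state) {
        throw new AssertionError("can't handle " + event + " at state " + state);
    }
}

class HookSketchDemo {
    public static void main(String[] args) {
        SketchApp app = new SketchAppForTest();
        app.handle(SketchEvent.KILL);      // NEW -> KILLED
        try {
            app.handle(SketchEvent.START); // invalid at KILLED, so the hook fires
        } catch (AssertionError e) {
            System.out.println("test would fail: " + e.getMessage());
        }
    }
}
```

The point of the design is that production behavior is unchanged (the hook is empty), while a unit test that builds the test subclass can no longer pass when an invalid transition is only logged.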

However, two more invalid state transitions show up while testing:
1. testAppAcceptedFailed: APP_ACCEPTED at state ACCEPTED
2. testAppRunningFailed: APP_UPDATE_SAVED at state KILLED
These two invalid transitions may be bugs in RMAppImpl, or they may be bugs in TestRMAppTransitions.
Would it be better to defer them to another JIRA?

Or can we just ignore the event without adding a test, as in [YARN-4598|https://issues.apache.org/jira/browse/YARN-4598]?




> RMAppImpl:Invalid event: START at KILLED
> ----------------------------------------
>
>                 Key: YARN-7663
>                 URL: https://issues.apache.org/jira/browse/YARN-7663
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: lujie
>            Assignee: lujie
>            Priority: Minor
>              Labels: patch
>             Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 2.8.4
>
>         Attachments: YARN-7663_1.patch, YARN-7663_2.patch, YARN-7663_3.patch, YARN-7663_4.patch, YARN-7663_5.patch, YARN-7663_6.patch, YARN-7663_7.patch
>
>
> When a kill is sent to an application, the RM log shows:
> {code:java}
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: START at KILLED
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:805)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:901)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:885)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> If a sleep is inserted just before the point where the START event is created, this bug reproduces deterministically.
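
The ordering problem in the description can be sketched with a tiny table-driven state machine. This is only an illustration under stated assumptions: DemoState, DemoEvent and DemoStateMachine are hypothetical names, the three states are a small subset chosen for the example, and the self-loop used to ignore START at KILLED is one possible way to ignore an event, not necessarily what the attached patches do.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified types; not the real YARN state machine classes.
enum DemoState { NEW, NEW_SAVING, KILLED }
enum DemoEvent { START, KILL }

class DemoStateMachine {
    private final Map<DemoState, Map<DemoEvent, DemoState>> table =
        new EnumMap<>(DemoState.class);
    private DemoState current = DemoState.NEW;

    DemoStateMachine addTransition(DemoState from, DemoEvent on, DemoState to) {
        table.computeIfAbsent(from, k -> new EnumMap<>(DemoEvent.class)).put(on, to);
        return this;
    }

    // Applies one event; an unregistered (state, event) pair is an error.
    void handle(DemoEvent event) {
        Map<DemoEvent, DemoState> row = table.get(current);
        DemoState next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = next;
    }

    DemoState getState() { return current; }

    // Transition table without any entry for START at KILLED.
    static DemoStateMachine withoutFix() {
        return new DemoStateMachine()
            .addTransition(DemoState.NEW, DemoEvent.START, DemoState.NEW_SAVING)
            .addTransition(DemoState.NEW, DemoEvent.KILL, DemoState.KILLED)
            .addTransition(DemoState.NEW_SAVING, DemoEvent.KILL, DemoState.KILLED);
    }

    // Same table plus a self-loop, so a late START at KILLED is ignored.
    static DemoStateMachine withFix() {
        return withoutFix()
            .addTransition(DemoState.KILLED, DemoEvent.START, DemoState.KILLED);
    }
}

class RaceOrderDemo {
    public static void main(String[] args) {
        // Delayed START: the KILL event is processed first.
        DemoStateMachine broken = DemoStateMachine.withoutFix();
        broken.handle(DemoEvent.KILL);
        try {
            broken.handle(DemoEvent.START);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "Invalid event: START at KILLED"
        }

        DemoStateMachine fixed = DemoStateMachine.withFix();
        fixed.handle(DemoEvent.KILL);
        fixed.handle(DemoEvent.START);          // ignored, state stays KILLED
        System.out.println(fixed.getState());   // prints "KILLED"
    }
}
```

When START arrives first, both tables behave the same (NEW to NEW_SAVING, then KILLED on KILL); the difference only appears when the START event is delayed past the KILL.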



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


