hadoop-yarn-issues mailing list archives

From "Wilfred Spiegelenburg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-9194) Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM
Date Wed, 16 Jan 2019 02:00:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743551#comment-16743551 ]

Wilfred Spiegelenburg commented on YARN-9194:
---------------------------------------------

Changing it back to SCHEDULED looks much better. I have one question left on this one: the
change performs its check after we have already stored the container as the master container
on the attempt. This means that from that point on, even though we have no container, any
call to {{SchedulerApplicationAttempt.isWaitingForAMContainer()}} will still report that an
AM container has been allocated. This could affect scheduling.

Should the check not be done before we make any changes to the appAttempt?
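
To make the ordering concern concrete, here is a minimal, self-contained sketch. It is not
the YARN code: {{AttemptSketch}}, {{storeThenCheck}}, {{checkThenStore}} and the
{{containerStillAvailable}} flag are hypothetical names made up for illustration, and the
sketch assumes that "waiting for the AM container" simply means that no master container is
recorded on the attempt.

```java
// Hypothetical illustration only; none of these types are the real YARN classes.
class AmContainerOrderingSketch {

    /** Stand-in for an app attempt that records its master (AM) container. */
    static class AttemptSketch {
        private String masterContainerId; // null means no AM container recorded

        void setMasterContainer(String containerId) {
            this.masterContainerId = containerId;
        }

        // Assumption: "waiting" means no master container has been stored yet.
        boolean isWaitingForAMContainer() {
            return masterContainerId == null;
        }
    }

    /** Ordering questioned above: store first, validate afterwards. */
    static boolean storeThenCheck(AttemptSketch attempt, String containerId,
                                  boolean containerStillAvailable) {
        attempt.setMasterContainer(containerId);
        if (!containerStillAvailable) {
            // The caller falls back to SCHEDULED, but the attempt keeps the stale container.
            return false;
        }
        return true;
    }

    /** Suggested ordering: validate first, touch the attempt only on success. */
    static boolean checkThenStore(AttemptSketch attempt, String containerId,
                                  boolean containerStillAvailable) {
        if (!containerStillAvailable) {
            return false; // attempt is left unchanged
        }
        attempt.setMasterContainer(containerId);
        return true;
    }

    public static void main(String[] args) {
        AttemptSketch first = new AttemptSketch();
        storeThenCheck(first, "container_1", false);
        // prints "store-then-check: waiting=false"
        System.out.println("store-then-check: waiting=" + first.isWaitingForAMContainer());

        AttemptSketch second = new AttemptSketch();
        checkThenStore(second, "container_1", false);
        // prints "check-then-store: waiting=true"
        System.out.println("check-then-store: waiting=" + second.isWaitingForAMContainer());
    }
}
```

In the first ordering the attempt stops reporting that it is waiting even though the
container is gone; in the second it still reports waiting, which is what the scheduler would
presumably need to see after the fallback to SCHEDULED.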


> Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM
> -----------------------------------------------------------------------------------------------
>
>                 Key: YARN-9194
>                 URL: https://issues.apache.org/jira/browse/YARN-9194
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: lujie
>            Assignee: lujie
>            Priority: Critical
>         Attachments: YARN-9194_1.patch, YARN-9194_2.patch, YARN-9194_3.patch, YARN-9194_4.patch, hadoop-hires-resourcemanager-hadoop11.log
>
>
> The REGISTERED event arrives when the attempt is already in the FAILED state, so an InvalidStateTransitionException is thrown.
>  
> {code:java}
> 2019-01-13 00:41:57,127 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: App attempt: appattempt_1547311267249_0001_000002 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: REGISTERED at FAILED
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:913)
> at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:121)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:1073)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:1054)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:745)
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


