hadoop-yarn-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-369) Handle ( or throw a proper error when receiving) status updates from application masters that have not registered
Date Mon, 04 Mar 2013 19:57:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592567#comment-13592567 ]

Bikas Saha commented on YARN-369:
---------------------------------

The RM already verifies that the app attempt is valid. It does so via the responseMap, which
sounds similar to the map you propose. The map is populated when the attempt is created,
so the RM's ApplicationMasterService knows that the new app attempt is the official
one. Look at ApplicationMasterService.registerAppAttempt().
Given the current state of the code, the simplest solution would be to set the responseId
in ApplicationMasterService.registerAppAttempt() to Integer.MIN_VALUE (a negative number), and
then, in registerApplicationMaster(), to set the responseId of lastResponse to 0, because after
that the application can start issuing allocate requests. If the app calls allocate before
register, the existing checks in allocate() will fail and we will be safe.
It would be great to add a test for this basic functionality.
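
A minimal sketch of the idea, for illustration only. This is not the actual ApplicationMasterService code: the class AmServiceSketch, the String attempt ids, the plain map of integers and the equality check in allocate() are all assumptions standing in for the real responseMap and the real checks.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the proposal; not the Hadoop implementation.
class AmServiceSketch {
    // Sentinel meaning "attempt created, but AM has not registered yet".
    static final int UNREGISTERED = Integer.MIN_VALUE;

    // Stand-in for the responseMap: attempt id -> last response id.
    private final Map<String, Integer> lastResponseIds = new ConcurrentHashMap<>();

    // Called when the RM creates the attempt: start with a negative response id.
    void registerAppAttempt(String attemptId) {
        lastResponseIds.put(attemptId, UNREGISTERED);
    }

    // Called when the AM registers: from now on allocate calls are legal.
    void registerApplicationMaster(String attemptId) {
        if (lastResponseIds.replace(attemptId, 0) == null) {
            throw new IllegalStateException("Unknown attempt: " + attemptId);
        }
    }

    // Assumed stand-in for the existing checks: the request must carry the
    // last response id, and a negative last id means "not registered".
    int allocate(String attemptId, int requestResponseId) {
        Integer last = lastResponseIds.get(attemptId);
        if (last == null) {
            throw new IllegalStateException("Unknown attempt: " + attemptId);
        }
        if (last < 0 || requestResponseId != last) {
            throw new IllegalStateException(
                "Allocate rejected for " + attemptId + ": not registered or stale response id");
        }
        int next = last + 1;
        lastResponseIds.put(attemptId, next);
        return next;
    }
}
```

With this model, an allocate call that arrives between registerAppAttempt() and registerApplicationMaster() is rejected with an explicit exception instead of reaching the attempt state machine, which is the kind of behavior a test for this change could assert.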
                
> Handle ( or throw a proper error when receiving) status updates from application masters that have not registered
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-369
>                 URL: https://issues.apache.org/jira/browse/YARN-369
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Hitesh Shah
>            Assignee: Abhishek Kapoor
>
> Currently, an allocate call from an unregistered application is allowed and the status update for it throws a statemachine error that is silently dropped.
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED
>        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:588)
>        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:99)
>        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:471)
>        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:452)
>        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
>        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
>        at java.lang.Thread.run(Thread.java:680)
> ApplicationMasterService should likely throw an appropriate error for applications' requests that should not be handled in such cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
