hadoop-yarn-issues mailing list archives

From "Anubhav Dhoot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1365) ApplicationMasterService to allow Register and Unregister of an app that was running before restart
Date Fri, 23 May 2014 22:00:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007768#comment-14007768 ]

Anubhav Dhoot commented on YARN-1365:
-------------------------------------

The error occurs because RMAppRecoveredTransition leaves the state in LAUNCHED and the scheduler
then issues ATTEMPT_ADDED. I see Jian fixed this one way in YARN-1368, but that fix only covers
the case where the state is still LAUNCHED. If the state reaches RUNNING before that, we still
get the error. The option I see is to pass a flag in AppAttemptAddedSchedulerEvent that tells
the scheduler not to issue ATTEMPT_ADDED. The flag would be set in RMAppRecoveredTransition.
Let me know what you think.
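
To make the proposal concrete, here is a minimal, self-contained sketch of the flag idea. It is
not the real YARN code: AttemptAddedSchedulerEventSketch, SchedulerSketch, the
shouldNotifyAttemptAdded() accessor and the string attempt ids are all hypothetical stand-ins,
and the actual patch may look quite different. The scheduler still tracks the attempt but skips
sending ATTEMPT_ADDED when the event says not to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for AppAttemptAddedSchedulerEvent.
// The flag name and constructor shape are assumptions for illustration only.
final class AttemptAddedSchedulerEventSketch {
    private final String attemptId;
    private final boolean notifyAttemptAdded;

    AttemptAddedSchedulerEventSketch(String attemptId, boolean notifyAttemptAdded) {
        this.attemptId = attemptId;
        this.notifyAttemptAdded = notifyAttemptAdded;
    }

    String getAttemptId() {
        return attemptId;
    }

    boolean shouldNotifyAttemptAdded() {
        return notifyAttemptAdded;
    }
}

// Toy scheduler: it always records the attempt, but only sends
// ATTEMPT_ADDED back to the attempt when the event asks for it.
final class SchedulerSketch {
    private final List<String> trackedAttempts = new ArrayList<>();
    private final List<String> eventsSentToAttempt = new ArrayList<>();

    void handle(AttemptAddedSchedulerEventSketch event) {
        trackedAttempts.add(event.getAttemptId());
        if (event.shouldNotifyAttemptAdded()) {
            eventsSentToAttempt.add("ATTEMPT_ADDED:" + event.getAttemptId());
        }
    }

    List<String> getTrackedAttempts() {
        return trackedAttempts;
    }

    List<String> getEventsSentToAttempt() {
        return eventsSentToAttempt;
    }
}

final class RecoveryFlagSketch {
    public static void main(String[] args) {
        SchedulerSketch scheduler = new SchedulerSketch();

        // Normal submission: the attempt should be told it was added.
        scheduler.handle(new AttemptAddedSchedulerEventSketch("attempt_1", true));

        // Recovery path (what RMAppRecoveredTransition would set in this sketch):
        // the attempt is already LAUNCHED or RUNNING, so suppress ATTEMPT_ADDED.
        scheduler.handle(new AttemptAddedSchedulerEventSketch("attempt_2", false));

        System.out.println(scheduler.getTrackedAttempts());
        System.out.println(scheduler.getEventsSentToAttempt());
    }
}
```

With the flag on the event itself, the scheduler does not need to know which state the recovered
attempt is in, so the LAUNCHED and RUNNING cases are handled the same way.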

> ApplicationMasterService to allow Register and Unregister of an app that was running before restart
> ---------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1365
>                 URL: https://issues.apache.org/jira/browse/YARN-1365
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1365.001.patch, YARN-1365.002.patch, YARN-1365.003.patch, YARN-1365.initial.patch
>
>
> For an application that was running before restart, the ApplicationMasterService currently
> throws an exception when the app tries to make the initial register or final unregister call.
> These should succeed and the RMApp state machine should transition to completed like normal.
> Unregistration should succeed for an app that the RM considers complete since the RM may have
> died after saving completion in the store but before notifying the AM that the AM is free
> to exit.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
