hadoop-yarn-issues mailing list archives

From "Xuan Gong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
Date Sun, 22 Dec 2013 01:44:50 GMT

    [ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855051#comment-13855051 ]

Xuan Gong commented on YARN-1410:
---------------------------------

This is how the patch works:
1. Create a new exception called ApplicationIdNotAssignedByCurrentRMException (a better name
is needed).
2. On submission, the RM checks whether it assigned this applicationId. If so, it continues;
otherwise it throws NotCurrentRMException (the exception from step 1).
3. In YarnClientImpl#submitApplication(), if the client gets this NotCurrentRMException, it
obtains a new applicationId from the currently active RM and resubmits the application.
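
The three steps above can be sketched as a small, self-contained Java model. This is only an illustration under stated assumptions, not the patch itself: ResourceManager, AppId and submitWithFailover are hypothetical stand-ins (the real code lives in the RM and in YarnClientImpl), and the exception is named NotCurrentRMException here purely for brevity. It assumes, as the issue description says, that the cluster timestamp is part of the app id.

```java
import java.util.HashSet;
import java.util.Set;

public class FailoverSubmitSketch {

    /** Hypothetical stand-in for the exception proposed in step 1. */
    static class NotCurrentRMException extends Exception {
        NotCurrentRMException(String message) { super(message); }
    }

    /** Simplified application id: cluster timestamp plus a sequence number. */
    static final class AppId {
        final long clusterTimestamp;
        final int sequence;

        AppId(long clusterTimestamp, int sequence) {
            this.clusterTimestamp = clusterTimestamp;
            this.sequence = sequence;
        }

        @Override
        public String toString() {
            return "application_" + clusterTimestamp + "_" + sequence;
        }
    }

    /** Toy RM that only accepts ids carrying its own cluster timestamp (step 2). */
    static class ResourceManager {
        private final long clusterTimestamp;
        private int nextSequence = 1;
        final Set<String> submitted = new HashSet<>();

        ResourceManager(long clusterTimestamp) {
            this.clusterTimestamp = clusterTimestamp;
        }

        AppId getNewApplicationId() {
            return new AppId(clusterTimestamp, nextSequence++);
        }

        void submitApplication(AppId id) throws NotCurrentRMException {
            if (id.clusterTimestamp != clusterTimestamp) {
                throw new NotCurrentRMException(id + " was not assigned by this RM");
            }
            submitted.add(id.toString());
        }
    }

    /** Client side (step 3): on rejection, fetch a fresh id from the active RM and resubmit. */
    static AppId submitWithFailover(ResourceManager activeRm, AppId id) {
        try {
            activeRm.submitApplication(id);
            return id;
        } catch (NotCurrentRMException rejected) {
            AppId fresh = activeRm.getNewApplicationId();
            try {
                activeRm.submitApplication(fresh);
            } catch (NotCurrentRMException again) {
                // The active RM rejected an id it just issued; give up rather than loop.
                throw new IllegalStateException("RM rejected its own id", again);
            }
            return fresh;
        }
    }

    public static void main(String[] args) {
        ResourceManager oldRm = new ResourceManager(1000L); // RM before failover
        ResourceManager newRm = new ResourceManager(2000L); // RM after failover
        AppId stale = oldRm.getNewApplicationId();
        AppId accepted = submitWithFailover(newRm, stale);
        System.out.println(stale + " -> " + accepted);
        // prints "application_1000_1 -> application_2000_1"
    }
}
```

Note that in this sketch the caller must use the returned id from then on, since the id it originally held is never registered with the new RM.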

Also, in this patch, I explicitly add the Idempotent annotation to submitApplication and
getApplicationReport in ApplicationClientProtocol. I think those are enough for this scenario.
Whether we need to explicitly add the Idempotent annotation to other methods in other protocols
will be tracked in https://issues.apache.org/jira/browse/YARN-1521
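
To illustrate why the annotation matters, here is a rough sketch of a retry wrapper that re-invokes a failed call only when the method is marked idempotent. Everything in it is an assumption made for illustration: the Idempotent annotation is defined locally so the example is self-contained (it is not the Hadoop class), ClientProtocol, unmarkedCall, FlakyServer and withRetry are hypothetical names, and the actual retry and failover policy in Hadoop is more involved than this.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class IdempotentRetrySketch {

    /** Local stand-in annotation; assumed to mean "safe to invoke more than once". */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Idempotent {}

    /** Hypothetical protocol with one marked and one unmarked method. */
    interface ClientProtocol {
        @Idempotent
        String getApplicationReport(String appId);

        String unmarkedCall(String appId);
    }

    /** Toy server whose first call to each method fails, simulating a lost connection. */
    static class FlakyServer implements ClientProtocol {
        int reportCalls = 0;
        int unmarkedCalls = 0;

        @Override
        public String getApplicationReport(String appId) {
            if (++reportCalls == 1) {
                throw new IllegalStateException("connection lost");
            }
            return "report for " + appId;
        }

        @Override
        public String unmarkedCall(String appId) {
            if (++unmarkedCalls == 1) {
                throw new IllegalStateException("connection lost");
            }
            return "done " + appId;
        }
    }

    /** Wraps the target so that only @Idempotent methods are retried, up to maxAttempts. */
    static ClientProtocol withRetry(ClientProtocol target, int maxAttempts) {
        InvocationHandler handler = (proxy, method, args) -> {
            int allowed = method.isAnnotationPresent(Idempotent.class) ? maxAttempts : 1;
            for (int attempt = 1; ; attempt++) {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    if (attempt >= allowed) {
                        throw e.getCause(); // out of attempts: surface the original failure
                    }
                }
            }
        };
        return (ClientProtocol) Proxy.newProxyInstance(
                ClientProtocol.class.getClassLoader(),
                new Class<?>[] { ClientProtocol.class },
                handler);
    }

    public static void main(String[] args) {
        FlakyServer server = new FlakyServer();
        ClientProtocol client = withRetry(server, 3);
        System.out.println(client.getApplicationReport("app_1")); // retried once, then succeeds
        try {
            client.unmarkedCall("app_1");
        } catch (IllegalStateException e) {
            System.out.println("not retried: " + e.getMessage());
        }
    }
}
```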


> Handle client failover during 2 step client API's like app submission
> ---------------------------------------------------------------------
>
>                 Key: YARN-1410
>                 URL: https://issues.apache.org/jira/browse/YARN-1410
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>
> App submission involves
> 1) creating appId
> 2) using that appId to submit an ApplicationSubmissionContext to the RM.
> The client may have obtained an appId from an RM, the RM may have failed over, and the
> client may submit the app to the new RM.
> Since the new RM has a different notion of cluster timestamp (used to create app id)
> the new RM may reject the app submission resulting in unexpected failure on the client side.
> The same may happen for other 2 step client API operations.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
