hadoop-yarn-issues mailing list archives

From "Xuan Gong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
Date Thu, 02 Jan 2014 22:43:52 GMT

    [ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860887#comment-13860887 ]

Xuan Gong commented on YARN-1410:

Thanks for the comments. [~bikassaha], [~kkambatl]

bq.But I think we have a separate createApplication() in order to get an appId for which to
request RM tokens so that those tokens can be inserted in the AppSubmitContext before app

It looks like requesting RM tokens does not need an appId. But I agree that we still
need createApplication(). It gives us a globally unique ID that we can use for several
things, such as creating the JobId for a MapReduce job or using it as part of the path
when setting up local resources.
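
As a rough illustration of why the ID is useful before submission, here is a minimal sketch
of deriving a job ID and a staging path from an appId. The AppId class, job_id_for(), and
staging_path() are hypothetical names for this example, not YARN APIs, and the staging
directory layout is an assumption. The only thing taken from YARN is the general shape of
the ID: cluster timestamp plus a per-RM sequence number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppId:
    """Hypothetical stand-in for an application ID."""
    cluster_timestamp: int  # start time of the RM that issued the ID
    sequence: int           # counter local to that RM

    def __str__(self) -> str:
        return f"application_{self.cluster_timestamp}_{self.sequence:04d}"


def job_id_for(app_id: AppId) -> str:
    # Reuse the same timestamp and sequence so job and app stay correlated.
    return f"job_{app_id.cluster_timestamp}_{app_id.sequence:04d}"


def staging_path(user: str, app_id: AppId) -> str:
    # Assumed layout, for illustration only.
    return f"/user/{user}/.staging/{app_id}"


if __name__ == "__main__":
    app_id = AppId(cluster_timestamp=1388702632000, sequence=7)
    print(job_id_for(app_id))
    print(staging_path("alice", app_id))
```

Both derived values embed the cluster timestamp, which is why an ID issued by one RM is
tied to that RM instance.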

For the solution of this ticket, I think the better way is to ask the client to always use
submitApplication() (adding comments on the YarnClient API). In submitApplication(), we can
check whether the appId is provided in the ASC or not. If it is, we can use this appId (of
course, we need to check whether this appId was provided by the current active RM) to submit
the application. If not, we can request one and then do the submission.
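
The flow proposed above can be sketched as follows. This is a toy model, not the YARN
implementation: ResourceManager, SubmissionContext, StaleAppIdError, and the
submit_application() helper are hypothetical names. Comparing cluster timestamps is one
assumed way to check that an appId came from the current active RM, and raising an error
for a stale appId is also an assumption, since the comment does not say what should happen
in that case.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppId:
    cluster_timestamp: int
    sequence: int


@dataclass
class SubmissionContext:
    # Stand-in for the ASC; app_id is None when the client never asked for one.
    app_id: Optional[AppId] = None


class StaleAppIdError(Exception):
    """Raised when an appId was issued by an RM other than the active one."""


class ResourceManager:
    def __init__(self, cluster_timestamp: int):
        self.cluster_timestamp = cluster_timestamp
        self._counter = itertools.count(1)
        self.submitted = []

    def create_application(self) -> AppId:
        return AppId(self.cluster_timestamp, next(self._counter))

    def issued(self, app_id: AppId) -> bool:
        # Assumed check: the ID carries this RM's cluster timestamp.
        return app_id.cluster_timestamp == self.cluster_timestamp


def submit_application(active_rm: ResourceManager, ctx: SubmissionContext) -> AppId:
    if ctx.app_id is None:
        # No appId in the context: ask the active RM for one first.
        ctx.app_id = active_rm.create_application()
    elif not active_rm.issued(ctx.app_id):
        # appId came from a different RM, e.g. before a failover.
        raise StaleAppIdError(f"{ctx.app_id} was not issued by the active RM")
    active_rm.submitted.append(ctx.app_id)
    return ctx.app_id


if __name__ == "__main__":
    old_rm = ResourceManager(cluster_timestamp=100)
    new_rm = ResourceManager(cluster_timestamp=200)
    ctx = SubmissionContext(app_id=old_rm.create_application())
    try:
        submit_application(new_rm, ctx)  # failover happened in between
    except StaleAppIdError as err:
        print("rejected:", err)
    print(submit_application(new_rm, SubmissionContext()))
```

With a single entry point, a client that never calls createApplication() itself cannot end
up holding an ID from an RM that has since failed over.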

> Handle client failover during 2 step client API's like app submission
> ---------------------------------------------------------------------
>                 Key: YARN-1410
>                 URL: https://issues.apache.org/jira/browse/YARN-1410
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>         Attachments: YARN-1410.1.patch
> App submission involves
> 1) creating appId
> 2) using that appId to submit an ApplicationSubmissionContext to the user.
> The client may have obtained an appId from an RM, the RM may have failed over, and the
> client may submit the app to the new RM.
> Since the new RM has a different notion of cluster timestamp (used to create app id)
> the new RM may reject the app submission resulting in unexpected failure on the client side.
> The same may happen for other 2 step client API operations.

This message was sent by Atlassian JIRA