hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1410) Handle RM fails over after getApplicationID() and before submitApplication().
Date Fri, 07 Mar 2014 02:19:43 GMT

    [ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923434#comment-13923434 ]

Vinod Kumar Vavilapalli commented on YARN-1410:

Okay, I went back and reread the thing. It seems like we have diverged again. The approach
in the latest patch doesn't seem to be the same as what Bikas and you agreed upon. Is that
true? [~bikassaha], can you confirm whether it is fine? We now blindly accept appIDs generated
by the previous RM. Clearly, there are possibilities of malicious users generating appIDs (which
exist today) - but there are a couple of ways in which we can fix that.

Originally, it was also suggested that we add app-ID to the SubmitResponse - which we aren't
doing anymore as we blindly accept IDs from previous RMs now in the latest patch. Is that

> Handle RM fails over after getApplicationID() and before submitApplication().
> -----------------------------------------------------------------------------
>                 Key: YARN-1410
>                 URL: https://issues.apache.org/jira/browse/YARN-1410
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>         Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.10.patch,
> YARN-1410.10.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch,
> YARN-1410.5.patch, YARN-1410.6.patch, YARN-1410.7.patch, YARN-1410.8.patch, YARN-1410.9.patch
>   Original Estimate: 48h
>  Remaining Estimate: 48h
> App submission involves
> 1) creating appId
> 2) using that appId to submit an ApplicationSubmissionContext to the RM.
> The client may have obtained an appId from an RM, the RM may have failed over, and the
> client may submit the app to the new RM.
> Since the new RM has a different notion of cluster timestamp (used to create the app id),
> the new RM may reject the app submission, resulting in an unexpected failure on the client side.
> The same may happen for other two-step client API operations.
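
The failure mode described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the YARN implementation: `AppId`, `SimpleRM`, `getNewApplicationId` and `submitApplication` are hypothetical stand-ins, and the strict timestamp check is an assumed validation rule chosen to show why a submission might be rejected after failover.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FailoverSketch {

    /** Hypothetical app ID: the issuing RM's cluster timestamp plus a sequence number. */
    static final class AppId {
        final long clusterTimestamp;
        final int sequence;

        AppId(long clusterTimestamp, int sequence) {
            this.clusterTimestamp = clusterTimestamp;
            this.sequence = sequence;
        }
    }

    /** Hypothetical, heavily simplified RM that only knows its own cluster timestamp. */
    static final class SimpleRM {
        private final long clusterTimestamp;
        private final AtomicInteger counter = new AtomicInteger();

        SimpleRM(long clusterTimestamp) {
            this.clusterTimestamp = clusterTimestamp;
        }

        /** Step 1: hand out a new app ID stamped with this RM's cluster timestamp. */
        AppId getNewApplicationId() {
            return new AppId(clusterTimestamp, counter.incrementAndGet());
        }

        /** Step 2 (assumed rule): accept only IDs carrying this RM's own timestamp. */
        boolean submitApplication(AppId id) {
            return id.clusterTimestamp == clusterTimestamp;
        }
    }

    public static void main(String[] args) {
        SimpleRM oldRm = new SimpleRM(1000L);
        AppId id = oldRm.getNewApplicationId();   // client completes step 1

        SimpleRM newRm = new SimpleRM(2000L);     // failover: new RM, new timestamp

        System.out.println("old RM accepts: " + oldRm.submitApplication(id));
        System.out.println("new RM accepts: " + newRm.submitApplication(id));
    }
}
```

Under this assumed rule the ID is accepted by the RM that issued it and rejected by the RM that took over, which is the unexpected client-side failure the description refers to.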

This message was sent by Atlassian JIRA