Return-Path: X-Original-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 1BB74FB8C for ; Wed, 1 May 2013 01:41:15 +0000 (UTC) Received: (qmail 8958 invoked by uid 500); 1 May 2013 01:41:15 -0000 Delivered-To: apmail-hadoop-yarn-issues-archive@hadoop.apache.org Received: (qmail 8913 invoked by uid 500); 1 May 2013 01:41:15 -0000 Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: yarn-issues@hadoop.apache.org Delivered-To: mailing list yarn-issues@hadoop.apache.org Received: (qmail 8904 invoked by uid 99); 1 May 2013 01:41:14 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 01 May 2013 01:41:14 +0000 Date: Wed, 1 May 2013 01:41:14 +0000 (UTC) From: "Bikas Saha (JIRA)" To: yarn-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (YARN-582) Restore appToken for app attempt after RM restart MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646282#comment-13646282 ] Bikas Saha commented on YARN-582: --------------------------------- Why is this inside the try instead of where the existing fields are set. Isnt is safe to pass null to setAppAttemptTokens? {code} try { + if(appAttemptTokens != null){ + attemptStateData.setAppAttemptTokens(appAttemptTokens); + } {code} Why create new token secret manager for every generateToken() call? {code} + private ByteBuffer generateTokens(ApplicationAttemptId attemptId, + Configuration conf) { + ApplicationTokenSecretManager appTokenMgr = + new ApplicationTokenSecretManager(conf); + ApplicationTokenIdentifier appTokenId = + new ApplicationTokenIdentifier(attemptId); {code} This check should be performed after restart also. {code} + // assert application Token is saved + Assert.assertEquals(attempt1Token, attemptState.getAppAttemptTokens()); {code} This check should be performed before restart also since we changed this code path. {code} + // assert ApplicationTokenSecretManager has the password populated + Assert.assertTrue(rm2.getApplicationTokenSecretManager().hasPassword( + newAttempt.getAppAttemptId())); {code} This is wrong because the new app should be creating its own tokens. {code} } + app.createNewAttempt(true); + break; + case RECOVER: + RMAppAttempt attempt = app.createNewAttempt(true); + + // reuse the appToken from previous attempt + if (UserGroupInformation.isSecurityEnabled()) { + ApplicationAttemptId previousAttempt = + Records.newRecord(ApplicationAttemptId.class); + previousAttempt.setApplicationId(app.getApplicationId()); + previousAttempt.setAttemptId(app.getAppAttempts().size() - 1); + ApplicationState appState = app.getRMState().getApplicationState() + .get(app.getApplicationId()); + ApplicationAttemptState attemptState = + appState.getAttempt(previousAttempt); + assert attemptState != null; + ((RMAppAttemptImpl) attempt).recoverAppAttemptTokens(attemptState + .getAppAttemptTokens()); + } + break; + default: {code} Hence this is wrong {code} + // assert the new Attempt id is the same as the desired new attempt id + Assert.assertEquals(desiredNewAttemptId, newAttempt.getAppAttemptId()); + + // assert new attempt reuse previous attempt tokens + Assert.assertEquals(attempt1Token, newAttempt.getAppAttemptTokens()); {code} Need to check for securityEnabled when recovering tokens and populating the secret manager? Can we move token creation from constructor of RMAppAttemptImpl to AttemptStartedTransition? That way we will not end up creating new tokens in constructor and overriding them in recover(). Also in recover(), lets just populate the tokens but not add them to the secret managers. Later in work preserving restart we need to create a NEW->RUNNING transition in which the restored tokens will be added to the secret manager. Things to check 1) Why is this code in FinalTransition and not BaseFinalTransition? {code} // Unregister from the ClientTokenSecretManager if (UserGroupInformation.isSecurityEnabled()) { appAttempt.rmContext.getClientToAMTokenSecretManager() .unRegisterApplication(appAttempt.getAppAttemptId()); } {code} 2) why is this duplicated in both BaseFinalTransition and AMUnregisteredTransition? {code} // Remove the AppAttempt from the ApplicationTokenSecretManager appAttempt.rmContext.getApplicationTokenSecretManager() .applicationMasterFinished(appAttemptId); {code} > Restore appToken for app attempt after RM restart > ------------------------------------------------- > > Key: YARN-582 > URL: https://issues.apache.org/jira/browse/YARN-582 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager > Reporter: Bikas Saha > Assignee: Jian He > Attachments: YARN-582.1.patch, YARN-582.2.patch > > > These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira