hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-582) Restore appToken for app attempt after RM restart
Date Wed, 01 May 2013 01:41:14 GMT

    [ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646282#comment-13646282

Bikas Saha commented on YARN-582:

Why is this inside the try instead of where the existing fields are set. 
Isnt is safe to pass null to setAppAttemptTokens?
           try {
+            if(appAttemptTokens != null){
+              attemptStateData.setAppAttemptTokens(appAttemptTokens);
+            }

Why create new token secret manager for every generateToken() call?
+  private ByteBuffer generateTokens(ApplicationAttemptId attemptId,
+      Configuration conf) {
+    ApplicationTokenSecretManager appTokenMgr =
+        new ApplicationTokenSecretManager(conf);
+    ApplicationTokenIdentifier appTokenId =
+        new ApplicationTokenIdentifier(attemptId);

This check should be performed after restart also.
+    // assert application Token is saved
+    Assert.assertEquals(attempt1Token, attemptState.getAppAttemptTokens());

This check should be performed before restart also since we changed this code path.
+    // assert ApplicationTokenSecretManager has the password populated
+    Assert.assertTrue(rm2.getApplicationTokenSecretManager().hasPassword(
+      newAttempt.getAppAttemptId()));

This is wrong because the new app should be creating its own tokens.
+        app.createNewAttempt(true);
+        break;
+      case RECOVER:
+        RMAppAttempt attempt = app.createNewAttempt(true);
+        // reuse the appToken from previous attempt
+        if (UserGroupInformation.isSecurityEnabled()) {
+          ApplicationAttemptId previousAttempt =
+              Records.newRecord(ApplicationAttemptId.class);
+          previousAttempt.setApplicationId(app.getApplicationId());
+          previousAttempt.setAttemptId(app.getAppAttempts().size() - 1);
+          ApplicationState appState = app.getRMState().getApplicationState()
+              .get(app.getApplicationId());
+          ApplicationAttemptState attemptState =
+              appState.getAttempt(previousAttempt);
+          assert attemptState != null;
+          ((RMAppAttemptImpl) attempt).recoverAppAttemptTokens(attemptState
+            .getAppAttemptTokens());
+        }
+        break;
+      default:
Hence this is wrong
+    // assert the new Attempt id is the same as the desired new attempt id
+    Assert.assertEquals(desiredNewAttemptId, newAttempt.getAppAttemptId());
+    // assert new attempt reuse previous attempt tokens
+    Assert.assertEquals(attempt1Token, newAttempt.getAppAttemptTokens());

Need to check for securityEnabled when recovering tokens and populating the secret manager?

Can we move token creation from constructor of RMAppAttemptImpl to AttemptStartedTransition?
That way we will not end up creating new tokens in constructor and overriding them in recover().
Also in recover(), lets just populate the tokens but not add them to the secret managers.
Later in work preserving restart we need to create a NEW->RUNNING transition in which the
restored tokens will be added to the secret manager.

Things to check
1) Why is this code in FinalTransition and not BaseFinalTransition?
      // Unregister from the ClientTokenSecretManager
      if (UserGroupInformation.isSecurityEnabled()) {
2) why is this duplicated in both BaseFinalTransition and AMUnregisteredTransition?
      // Remove the AppAttempt from the ApplicationTokenSecretManager
> Restore appToken for app attempt after RM restart
> -------------------------------------------------
>                 Key: YARN-582
>                 URL: https://issues.apache.org/jira/browse/YARN-582
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Jian He
>         Attachments: YARN-582.1.patch, YARN-582.2.patch
> These need to be saved and restored on a per app attempt basis. This is required only
when work preserving restart is implemented for secure clusters. In non-preserving restart
app attempts are killed and so this does not matter.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message