hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2884) Proxying all AM-RM communications
Date Thu, 27 Aug 2015 06:24:47 GMT

    [ https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716153#comment-14716153 ]

Jian He commented on YARN-2884:

Looks good to me overall, but I think there are still some problems with the AMRMProxyToken implementation.
Basically, long-running services may not work with the AMRMProxy.

1) The code below in DefaultRequestInterceptor should create and return a new AMRMProxyToken in
the final allocate response when needed. Otherwise, the AM will fail to talk to the AMRMProxy
after the key is rolled over in the AMRMProxyTokenSecretManager.
  public AllocateResponse allocate(AllocateRequest request)
      throws YarnException, IOException {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Forwarding allocate request to the real YARN RM");
    }
    AllocateResponse allocateResponse = rmClient.allocate(request);
    if (allocateResponse.getAMRMToken() != null) {
      // ...
    }
    return allocateResponse; // <====
  }

The code below in ApplicationMasterService#allocate shows how that is done:
      if (nextMasterKey != null
          && nextMasterKey.getMasterKey().getKeyId() != amrmTokenIdentifier
            .getKeyId()) {
        RMAppAttemptImpl appAttemptImpl = (RMAppAttemptImpl)appAttempt;
        Token<AMRMTokenIdentifier> amrmToken = appAttempt.getAMRMToken();
        if (nextMasterKey.getMasterKey().getKeyId() !=
            appAttemptImpl.getAMRMTokenKeyId()) {
          LOG.info("The AMRMToken has been rolled-over. Send new AMRMToken back"
              + " to application: " + applicationId);
          amrmToken = rmContext.getAMRMTokenSecretManager()
          // ...
        }
        // ...
          .newInstance(amrmToken.getIdentifier(), amrmToken.getKind()
            .toString(), amrmToken.getPassword(), amrmToken.getService()
            // ...
      }

2) Some methods in AMRMProxyTokenSecretManager are not used at all. Could we remove
them?

3) I think we need at least one end-to-end test for this. We can use MiniYARNCluster to simulate
the whole flow: the AM talks to the AMRMProxy, which talks to the RM to register/allocate/finish.
In the test, we should also reduce RM_AMRM_TOKEN_MASTER_KEY_ROLLING_INTERVAL_SECS so that
we can simulate the token rollover behavior. I'm OK with a separate JIRA to track the end-to-end
test, as this is a bit of work.
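
To illustrate point 1, here is a minimal, self-contained sketch of the rollover check, modeled on the condition in the ApplicationMasterService#allocate snippet quoted above. ProxyToken, SecretManager, and tokenForResponse are hypothetical stand-ins written for this example; they are not the Hadoop classes or the classes in the patch, and the key handling is deliberately simplified to integer key ids.

```java
import java.util.Optional;

public class ProxyTokenRolloverSketch {

    /** Hypothetical token: remembers which master key id signed it. */
    static final class ProxyToken {
        final String attemptId;
        final int keyId;

        ProxyToken(String attemptId, int keyId) {
            this.attemptId = attemptId;
            this.keyId = keyId;
        }
    }

    /** Hypothetical proxy-side secret manager with a current and a pending key. */
    static final class SecretManager {
        private int currentKeyId = 1;
        private Integer nextKeyId = null; // non-null only while a rollover is pending

        /** Starts a rollover: the next key exists, the current key stays valid. */
        void rollMasterKey() {
            nextKeyId = currentKeyId + 1;
        }

        /** Ends the rollover: tokens signed with the old key stop validating. */
        void activateNextMasterKey() {
            if (nextKeyId != null) {
                currentKeyId = nextKeyId;
                nextKeyId = null;
            }
        }

        /** Signs with the pending key when there is one, else the current key. */
        ProxyToken createToken(String attemptId) {
            int keyId = (nextKeyId != null) ? nextKeyId : currentKeyId;
            return new ProxyToken(attemptId, keyId);
        }

        boolean isValid(ProxyToken token) {
            return token.keyId == currentKeyId
                || (nextKeyId != null && token.keyId == nextKeyId);
        }
    }

    /**
     * Returns the token to attach to an allocate response: a fresh one when a
     * rollover is pending and the AM still holds a token signed with another
     * key, otherwise nothing.
     */
    static Optional<ProxyToken> tokenForResponse(SecretManager sm, ProxyToken held) {
        if (sm.nextKeyId != null && sm.nextKeyId != held.keyId) {
            return Optional.of(sm.createToken(held.attemptId));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        SecretManager sm = new SecretManager();
        ProxyToken held = sm.createToken("attempt_1");
        System.out.println(tokenForResponse(sm, held).isPresent()); // false

        sm.rollMasterKey();
        ProxyToken fresh = tokenForResponse(sm, held).get();
        System.out.println(fresh.keyId); // 2

        sm.activateNextMasterKey();
        System.out.println(sm.isValid(held));  // false
        System.out.println(sm.isValid(fresh)); // true
    }
}
```

In this simplified model, an AM that never receives the fresh token during the rollover window is left with a token that no longer validates once the new key is activated, which is the failure described in point 1.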

> Proxying all AM-RM communications
> ---------------------------------
>                 Key: YARN-2884
>                 URL: https://issues.apache.org/jira/browse/YARN-2884
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Carlo Curino
>            Assignee: Kishore Chaliparambil
>         Attachments: YARN-2884-V1.patch, YARN-2884-V10.patch, YARN-2884-V11.patch, YARN-2884-V2.patch,
> YARN-2884-V3.patch, YARN-2884-V4.patch, YARN-2884-V5.patch, YARN-2884-V6.patch, YARN-2884-V7.patch,
> YARN-2884-V8.patch, YARN-2884-V9.patch
> We introduce the notion of an RMProxy, running on each node (or once per rack). Upon
> start, the AM is forced (via tokens and configuration) to direct all its requests to a new
> service running on the NM that provides a proxy to the central RM.
> This gives us a place to:
> 1) perform distributed scheduling decisions
> 2) throttle misbehaving AMs
> 3) mask the access to a federation of RMs
