hadoop-yarn-issues mailing list archives

From "tangshangwen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5535) Remove RMDelegationToken make resourcemanager recovery very slow
Date Thu, 18 Aug 2016 08:41:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426107#comment-15426107 ]

tangshangwen commented on YARN-5535:
------------------------------------

Thanks [~sunilg] for the comments.
I think removing the RMDelegationToken and SequenceNumber may take a long time, which leaves
the RM unable to handle other events in the meantime:
{code:title=ZKRMStateStore.java|borderStyle=solid}
  @Override
  protected synchronized void removeRMDelegationTokenState(
      RMDelegationTokenIdentifier rmDTIdentifier) throws Exception {
    String nodeRemovePath =
        getNodePath(delegationTokensRootPath, DELEGATION_TOKEN_PREFIX
            + rmDTIdentifier.getSequenceNumber());
    if (LOG.isDebugEnabled()) {
      LOG.debug("Removing RMDelegationToken_"
          + rmDTIdentifier.getSequenceNumber());
    }
    if (existsWithRetries(nodeRemovePath, false) != null) {
      ArrayList<Op> opList = new ArrayList<Op>();
      opList.add(Op.delete(nodeRemovePath, -1));
      doDeleteMultiWithRetries(opList);
    } else {
      LOG.debug("Attempted to delete a non-existing znode " + nodeRemovePath);
    }
  }
{code}
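
To illustrate the concern, here is a rough sketch that only counts round trips. It is not the ZKRMStateStore code: {{CountingStore}}, {{removeOneByOne}}, {{removeBatched}} and the batch size are hypothetical names and values made up for this example, and the store is an in-memory stand-in rather than ZooKeeper. The first method mirrors the shape of the method quoted above (one existence check plus one single-op delete per token). The second is an assumed alternative that groups deletes into batches.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: counts store round trips, does not talk to ZooKeeper. */
final class RoundTripSketch {

    /** In-memory stand-in; each exists() or deleteMulti() call models one round trip. */
    static final class CountingStore {
        private final Set<String> nodes = new HashSet<>();
        int roundTrips = 0;

        void create(String path) { nodes.add(path); } // setup only, not counted

        boolean exists(String path) { roundTrips++; return nodes.contains(path); }

        void deleteMulti(List<String> paths) { roundTrips++; nodes.removeAll(paths); }

        int size() { return nodes.size(); }
    }

    /** Creates n token nodes in the store and returns their paths. */
    static List<String> seed(CountingStore store, int n) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            String path = "/tokens/RMDelegationToken_" + i; // illustrative path only
            store.create(path);
            paths.add(path);
        }
        return paths;
    }

    /** Same shape as the quoted method: exists check, then a one-element multi delete. */
    static int removeOneByOne(CountingStore store, List<String> paths) {
        int before = store.roundTrips;
        for (String path : paths) {
            if (store.exists(path)) {
                List<String> ops = new ArrayList<>();
                ops.add(path);
                store.deleteMulti(ops);
            }
        }
        return store.roundTrips - before;
    }

    /** Assumed alternative: one multi delete per batch, no per-node exists check. */
    static int removeBatched(CountingStore store, List<String> paths, int batchSize) {
        int before = store.roundTrips;
        for (int i = 0; i < paths.size(); i += batchSize) {
            int end = Math.min(i + batchSize, paths.size());
            store.deleteMulti(new ArrayList<>(paths.subList(i, end)));
        }
        return store.roundTrips - before;
    }

    public static void main(String[] args) {
        CountingStore a = new CountingStore();
        System.out.println("one by one: " + removeOneByOne(a, seed(a, 1000)) + " round trips");

        CountingStore b = new CountingStore();
        System.out.println("batched:    " + removeBatched(b, seed(b, 1000), 100) + " round trips");
    }
}
```

With a real ZooKeeper multi-op the whole batch fails if any znode in it is missing, so the batched variant is not a drop-in change; it only shows how the number of round trips scales with the number of tokens.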

> Remove RMDelegationToken make resourcemanager recovery very slow
> ----------------------------------------------------------------
>
>                 Key: YARN-5535
>                 URL: https://issues.apache.org/jira/browse/YARN-5535
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: tangshangwen
>            Assignee: tangshangwen
>
> In our cluster, I found that when the RM is restarted, RM recovery is very slow. This is my log:
> {noformat}
> [2016-08-12T19:43:21.478+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737879
> [2016-08-12T19:43:21.478+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.486+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737878
> [2016-08-12T19:43:21.486+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.494+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737877
> [2016-08-12T19:43:21.494+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.503+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737876
> [2016-08-12T19:43:21.503+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.519+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737875
> [2016-08-12T19:43:21.519+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.533+08:00] [INFO] security.authorize.ServiceAuthorizationManager.authorize(ServiceAuthorizationManager.java:148)
[Socket Reader #1 for port 8031] : Authorization successful for yarn (auth:SIMPLE) for protocol=interface
org.apache.hadoop.yarn.server.api.ResourceTrackerPB
> [2016-08-12T19:43:21.536+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737874
> [2016-08-12T19:43:21.536+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.553+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737873
> [2016-08-12T19:43:21.553+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.568+08:00] [INFO] yarn.util.RackResolver.coreResolve(RackResolver.java:109)
[IPC Server handler 0 on 8031] : Resolved xxxx-7056.hadoop.xxx.local to /rack/rack5118
> [2016-08-12T19:43:21.569+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737872
> [2016-08-12T19:43:21.569+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.570+08:00] [INFO] server.resourcemanager.ResourceTrackerService.registerNodeManager(ResourceTrackerService.java:343)
[IPC Server handler 0 on 8031] : NodeManager from node xxxxx-7056.hadoop.xxx.local(cmPort:
50086 httpPort: 8042) registered with capability: <memory:57344, vCores:28>, assigned
nodeId xxxxxx-7056.hadoop.xxx.local:50086
> [2016-08-12T19:43:21.572+08:00] [INFO] resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:424)
[AsyncDispatcher event handler] : xxxxxx-7056.hadoop.xxx.local:50086 Node Transitioned from
NEW to RUNNING
> [2016-08-12T19:43:21.576+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 1000
> [2016-08-12T19:43:21.577+08:00] [INFO] scheduler.fair.FairScheduler.addNode(FairScheduler.java:840)
[ResourceManager Event Processor] : Added node xxxxxxx-7056.hadoop.xxx.local:50086 cluster
capacity: <memory:57344, vCores:28>
> [2016-08-12T19:43:21.577+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 2000
> [2016-08-12T19:43:21.578+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737870
> [2016-08-12T19:43:21.578+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.579+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 3000
> [2016-08-12T19:43:21.580+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 4000
> [2016-08-12T19:43:21.582+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 5000
> [2016-08-12T19:43:21.583+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 6000
> [2016-08-12T19:43:21.585+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 7000
> [2016-08-12T19:43:21.586+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 8000
> [2016-08-12T19:43:21.586+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737871
> [2016-08-12T19:43:21.587+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.587+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 9000
> [2016-08-12T19:43:21.589+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235)
[AsyncDispatcher event handler] : Size of event-queue is 10000
> [2016-08-12T19:43:21.595+08:00] [INFO] resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
[Thread[Thread-26,5,main]] : removing RMDelegation token with sequence number: 737868
> [2016-08-12T19:43:21.595+08:00] [INFO] resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320)
[Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> {noformat}
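
As a back-of-the-envelope reading of the log excerpt above, the sketch below averages the gap between consecutive "removing RMDelegation token" lines and projects it onto a token count. The millisecond values are copied from the quoted timestamps. The token count is a made-up figure for illustration, since the report does not say how many tokens were stored, and the class and method names are hypothetical.

```java
/** Hypothetical sketch: estimates per-token removal time from the quoted log. */
final class RemovalRateSketch {

    // Millisecond parts of the "removing RMDelegation token" timestamps in the
    // quoted log; all of them fall within 2016-08-12T19:43:21.
    static final int[] REMOVAL_MILLIS =
        {478, 486, 494, 503, 519, 536, 553, 569, 578, 586, 595};

    /** Mean gap between consecutive removals: total span divided by interval count. */
    static double meanIntervalMillis(int[] millis) {
        int span = millis[millis.length - 1] - millis[0];
        return span / (double) (millis.length - 1);
    }

    /** Projected time in seconds if every token costs perTokenMillis. */
    static double projectedSeconds(long tokenCount, double perTokenMillis) {
        return tokenCount * perTokenMillis / 1000.0;
    }

    public static void main(String[] args) {
        double mean = meanIntervalMillis(REMOVAL_MILLIS);
        long hypotheticalTokens = 100_000L; // assumption, not taken from the report
        System.out.printf("mean interval: %.1f ms%n", mean);
        System.out.printf("projected: %.0f s for %d tokens%n",
            projectedSeconds(hypotheticalTokens, mean), hypotheticalTokens);
    }
}
```

Eleven removals spread over 117 ms give a mean gap of about 11.7 ms in this excerpt. A longer stretch of the log might show a different rate, so the projection is only indicative.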



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
