hadoop-yarn-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-515) Node Manager not getting the master key
Date Thu, 28 Mar 2013 21:55:15 GMT

    [ https://issues.apache.org/jira/browse/YARN-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616698#comment-13616698 ]

Robert Joseph Evans commented on YARN-515:

This is really odd.  I added logging to the ResourceTrackerService and to the NodeStatusUpdaterImpl.
The RM sets the secret key in the RegisterNodeManagerResponse, but the NM only sees a null
come out of it.  Because of that, the heartbeat always fails with an NPE when it tries to read
a key that was never set on the NM side.
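
To illustrate the failure mode, here is a minimal sketch of checking the key at registration time, so that a missing key is reported immediately instead of surfacing later as an NPE in the heartbeat thread. All class and method names below (MasterKeySketch, RegistrationResponse, SecretManager, register) are hypothetical stand-ins, not the actual Hadoop classes, and this is not a proposed patch for the issue.

```java
import java.util.Objects;

// Hypothetical stand-ins for illustration only; NOT the real Hadoop YARN classes.
final class MasterKeySketch {

    /** Hypothetical registration response; the key may arrive as null. */
    static final class RegistrationResponse {
        private final byte[] masterKey;

        RegistrationResponse(byte[] masterKey) {
            this.masterKey = masterKey;
        }

        byte[] getMasterKey() {
            return masterKey;
        }
    }

    /** Hypothetical node-side holder for the current master key. */
    static final class SecretManager {
        private byte[] currentKey;

        void setMasterKey(byte[] key) {
            // Reject a missing key here rather than storing null and
            // failing later when the heartbeat reads it.
            this.currentKey = Objects.requireNonNull(
                key, "master key missing from registration response");
        }

        boolean hasKey() {
            return currentKey != null;
        }
    }

    /** Applies the response to the secret manager and reports the outcome. */
    static String register(SecretManager sm, RegistrationResponse resp) {
        try {
            sm.setMasterKey(resp.getMasterKey());
            return "registered";
        } catch (NullPointerException e) {
            return "rejected: " + e.getMessage();
        }
    }
}
```

With a check like this, a null key would show up in the registration log line rather than one second later in the status-updater thread.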
> Node Manager not getting the master key
> ---------------------------------------
>                 Key: YARN-515
>                 URL: https://issues.apache.org/jira/browse/YARN-515
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.0.4-alpha
>            Reporter: Robert Joseph Evans
>            Priority: Blocker
> On the latest version of branch-2, I see the following on a secure cluster.
> {noformat}
> 2013-03-28 19:21:06,243 [main] INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Security enabled - updating secret keys now
> 2013-03-28 19:21:06,243 [main] INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registered with ResourceManager as RM:PORT with total resource of <memory:12288, vCores:16>
> 2013-03-28 19:21:06,244 [main] INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl is started.
> 2013-03-28 19:21:06,245 [main] INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.NodeManager is started.
> 2013-03-28 19:21:07,257 [Node Status Updater] ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Caught exception in status-updater
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.security.BaseContainerTokenSecretManager.getCurrentKey(BaseContainerTokenSecretManager.java:121)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:407)
> {noformat}
> The NullPointerException just keeps repeating and all of the nodes end up being lost.
> It looks like the NM never gets the secret key when it registers.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
