hadoop-common-user mailing list archives

From John Lilley <john.lil...@redpoint.net>
Subject ResourceManager shutting down
Date Thu, 13 Mar 2014 20:51:32 GMT
We are seeing erratic behavior where, every so often, the RM shuts down with an UnknownHostException.
The odd thing is that the host it complains about has been in use for days at that point without
problems.  Any ideas?
Thanks,
John


2014-03-13 14:38:14,746 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(578)) - application_1394204725813_0220 State change from ACCEPTED to RUNNING
2014-03-13 14:38:15,794 FATAL resourcemanager.ResourceManager (ResourceManager.java:run(449)) - Error in handling event type NODE_UPDATE to the scheduler
java.lang.IllegalArgumentException: java.net.UnknownHostException: skitzo.office.datalever.com
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
        at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
        at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:195)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.createContainerToken(LeafQueue.java:1297)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1345)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1211)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1170)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:871)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:645)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:690)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:734)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:86)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.UnknownHostException: skitzo.office.datalever.com
        ... 15 more
2014-03-13 14:38:15,794 INFO  resourcemanager.ResourceManager (ResourceManager.java:run(453)) - Exiting, bbye..
2014-03-13 14:38:15,911 INFO  mortbay.log (Slf4jLog.java:info(67)) - Stopped SelectChannelConnector@metallica.office.datalever.com:8088
2014-03-13 14:38:16,013 ERROR delegation.AbstractDelegationTokenSecretManager (AbstractDelegationTokenSecretManager.java:run(557)) - InterruptedExcpetion recieved for ExpiredTokenRemover thread java.lang.InterruptedException: sleep interrupted
2014-03-13 14:38:16,013 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(200)) - Stopping ResourceManager metrics system...
2014-03-13 14:38:16,014 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(206)) - ResourceManager metrics system stopped.
2014-03-13 14:38:16,014 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(572)) - ResourceManager metrics system shutdown complete.
2014-03-13 14:38:16,015 WARN  amlauncher.ApplicationMasterLauncher (ApplicationMasterLauncher.java:run(98)) - org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher$LauncherThread interrupted. Returning.
2014-03-13 14:38:16,015 INFO  ipc.Server (Server.java:stop(2442)) - Stopping server on 8141
2014-03-13 14:38:16,017 INFO  ipc.Server (Server.java:stop(2442)) - Stopping server on 8050
... and so on, it shuts down
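
The trace above shows the UnknownHostException for skitzo.office.datalever.com surfacing inside SecurityUtil.buildTokenService while a container token is being created, which suggests the hostname lookup itself fails intermittently on the RM host. One way to check that is a small standalone probe that resolves the name repeatedly and counts failures. The sketch below is only an illustration: the class DnsProbe, the Resolver hook, and the countFailures method are hypothetical names, not part of Hadoop, and only the standard java.net.InetAddress lookup is assumed. It needs Java 8 or later because of the method reference.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsProbe {

    /** Hypothetical hook so the probe can be exercised without a real network. */
    public interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /**
     * Resolves the host the given number of times, pausing between attempts,
     * and returns how many lookups failed with UnknownHostException.
     */
    public static int countFailures(Resolver resolver, String host,
                                    int attempts, long sleepMillis) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                resolver.resolve(host);
            } catch (UnknownHostException e) {
                failures++;
                System.err.println(System.currentTimeMillis()
                        + " lookup failed: " + e.getMessage());
            }
            if (i + 1 < attempts && sleepMillis > 0) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and stop probing early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Host name is taken from the command line; "localhost" is only a fallback.
        String host = args.length > 0 ? args[0] : "localhost";
        int attempts = 5;
        int failures = countFailures(InetAddress::getByName, host, attempts, 1000L);
        System.out.println(failures + " of " + attempts + " lookups failed for " + host);
    }
}
```

Run it on the RM machine with the failing hostname as the argument. Note that the JVM caches lookup results (controlled by the networkaddress.cache.ttl and networkaddress.cache.negative.ttl security properties), so a probe with a short interval may mostly hit the cache; a longer interval or a longer run is more likely to catch a transient resolver failure.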

