hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1694) RM is shutting down when an NM is added to cluster without updating the hostname in /etc/hosts
Date Fri, 07 Feb 2014 15:13:19 GMT

    [ https://issues.apache.org/jira/browse/YARN-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894614#comment-13894614 ]

Jason Lowe commented on YARN-1694:
----------------------------------

This appears to be a duplicate of YARN-713.

> RM is shutting down when an NM is added to cluster without updating the hostname in /etc/hosts
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-1694
>                 URL: https://issues.apache.org/jira/browse/YARN-1694
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.3.0
>            Reporter: Sunil G
>            Priority: Critical
>
> A new NM is added to the cluster, but the hostname mapping for this NM is not updated in /etc/hosts on the RM.
> NM registration succeeds without any problems.
> When a job is submitted, the RM shuts down with the exception below.
> 2013-10-04 04:37:37,611 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
> java.lang.IllegalArgumentException: java.net.UnknownHostException: host-10-18-40-120
>         at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
>         at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
>         at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:195)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.createContainerToken(LeafQueue.java:1296)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1344)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1210)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1169)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:870)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:645)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:707)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:751)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:93)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:449)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.UnknownHostException: host-10-18-40-120
>         ... 15 more
> 2013-10-04 04:37:37,614 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
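
The trace above shows an UnknownHostException wrapped in an IllegalArgumentException while a container token was being built for the new node. The following is a minimal, JDK-only sketch of that failure mode, not the Hadoop code itself: the class TokenServiceSketch, both of its methods, and port 8041 are assumptions made for illustration. It builds an unresolved address for the hostname from the log and shows that the unchecked exception can be caught per node instead of reaching the event dispatcher.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class TokenServiceSketch {

    // Hypothetical stand-in for the service-name step in the trace: reject an
    // address whose hostname was never resolved, wrapping the cause the same
    // way the logged exception is wrapped.
    static String buildTokenService(InetSocketAddress addr) {
        if (addr.isUnresolved()) {
            throw new IllegalArgumentException(
                new UnknownHostException(addr.getHostName()));
        }
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
    }

    // Hypothetical defensive wrapper: a node that cannot be resolved yields
    // null, so the caller can skip it rather than let the exception propagate.
    static String tryBuildTokenService(InetSocketAddress addr) {
        try {
            return buildTokenService(addr);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // createUnresolved performs no lookup, so this mimics a hostname that
        // is missing from /etc/hosts without depending on the local resolver.
        // Port 8041 is an assumed value, not taken from the report.
        InetSocketAddress unresolved =
            InetSocketAddress.createUnresolved("host-10-18-40-120", 8041);
        InetSocketAddress resolved =
            new InetSocketAddress(InetAddress.getLoopbackAddress(), 8041);

        System.out.println("unresolved node: " + tryBuildTokenService(unresolved));
        System.out.println("resolved node:   " + tryBuildTokenService(resolved));

        try {
            buildTokenService(unresolved);
        } catch (IllegalArgumentException e) {
            // The cause is the UnknownHostException for the missing hostname.
            System.out.println("uncaught path would raise: " + e);
        }
    }
}
```

Whether skipping the node is the right policy for a scheduler is a separate question; the sketch only illustrates that the exception is unchecked and propagates unless some caller handles it.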



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
