hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1444) RM crashes when node resource request sent without corresponding off-switch request
Date Thu, 13 Mar 2014 11:07:45 GMT

    [ https://issues.apache.org/jira/browse/YARN-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933102#comment-13933102 ]

Hudson commented on YARN-1444:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #508 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/508/])
YARN-1444. Fix CapacityScheduler to deal with cases where applications specify host/rack requests
without off-switch request. Contributed by Wangda Tan. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576751)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java


> RM crashes when node resource request sent without corresponding off-switch request
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-1444
>                 URL: https://issues.apache.org/jira/browse/YARN-1444
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client, resourcemanager
>            Reporter: Robert Grandl
>            Assignee: Wangda Tan
>            Priority: Blocker
>             Fix For: 2.4.0
>
>         Attachments: yarn-1444.ver1.patch, yarn-1444.ver2.patch
>
>
> I tried to force reducers to execute on certain nodes. For reduce tasks, I changed RMContainerRequestor#addResourceRequest(req.priority, ResourceRequest.ANY, req.capability) to RMContainerRequestor#addResourceRequest(req.priority, HOST_NAME, req.capability).
>
> However, this change led to RM crashes when reducers need to be assigned, with the following exception:
> FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
> java.lang.NullPointerException
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:841)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:640)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:554)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:695)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:739)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:86)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:549)
>     at java.lang.Thread.run(Thread.java:722)
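
The trace above is consistent with the scheduler looking up an application's off-switch (ResourceRequest.ANY, i.e. "*") entry and dereferencing it when only a host-level entry was submitted. The following is a minimal, self-contained sketch of that failure mode and of a null guard. It is not the YARN-1444 patch and does not use the Hadoop API: the class OffSwitchGuardSketch, the Request type, and both methods are hypothetical stand-ins introduced only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class OffSwitchGuardSketch {
    // YARN's ResourceRequest.ANY is the string "*"; redefined here so the
    // sketch compiles without Hadoop on the classpath.
    static final String ANY = "*";

    /** Hypothetical stand-in for a resource request: a location and a container count. */
    static final class Request {
        final String resourceName;
        final int numContainers;

        Request(String resourceName, int numContainers) {
            this.resourceName = resourceName;
            this.numContainers = numContainers;
        }
    }

    /** Unguarded lookup: throws NullPointerException when no off-switch entry exists. */
    static int pendingUnguarded(Map<String, Request> requests) {
        return requests.get(ANY).numContainers;
    }

    /** Guarded lookup: a missing off-switch entry is treated as nothing to schedule. */
    static int pendingGuarded(Map<String, Request> requests) {
        Request any = requests.get(ANY);
        return (any == null) ? 0 : any.numContainers;
    }

    public static void main(String[] args) {
        // Only a host-level request is registered, mirroring the modified client.
        Map<String, Request> requests = new HashMap<>();
        requests.put("host1.example.com", new Request("host1.example.com", 1));

        try {
            pendingUnguarded(requests);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NullPointerException");
        }
        System.out.println("guarded lookup returned " + pendingGuarded(requests));
    }
}
```

In this sketch the guard simply skips the application when the off-switch entry is missing; whether the real scheduler should skip, reject, or synthesize such a request is a design decision this example does not settle.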



--
This message was sent by Atlassian JIRA
(v6.2#6252)
