hadoop-mapreduce-issues mailing list archives

From "Vinod Kumar Vavilapalli (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3530) Sometimes NODE_UPDATE to the scheduler throws an NPE causing the scheduling to stop
Date Wed, 14 Dec 2011 02:22:30 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169004#comment-13169004 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3530:
----------------------------------------------------

bq. The RM should probably exit if the scheduler thread sees exceptions, instead of the RM continuing to run without the scheduler thread.
Let's do that separately. We need this kind of checking for all components.
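
For illustration only, here is a minimal sketch of the fail-fast behaviour being discussed: an event-processing loop that stops and escalates when a handler throws, rather than letting the thread die while the rest of the process keeps running. This is not the ResourceManager code. The class FailFastEventProcessor, the String event type, and the fatalHandler callback are all hypothetical names chosen for the example; a real daemon would likely log the error and exit the process inside that callback.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical fail-fast event loop; NOT the actual ResourceManager code.
class FailFastEventProcessor implements Runnable {
    private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
    private final Consumer<String> handler;         // stands in for the scheduler's handle(event)
    private final Consumer<Throwable> fatalHandler; // in a real daemon: log, then exit the process
    private volatile boolean stopped = false;

    FailFastEventProcessor(Consumer<String> handler, Consumer<Throwable> fatalHandler) {
        this.handler = handler;
        this.fatalHandler = fatalHandler;
    }

    void enqueue(String event) {
        eventQueue.add(event);
    }

    @Override
    public void run() {
        while (!stopped && !Thread.currentThread().isInterrupted()) {
            String event;
            try {
                event = eventQueue.take(); // blocks until an event arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            try {
                handler.accept(event);
            } catch (Throwable t) {
                // Do not let the thread die silently: stop the loop and escalate.
                stopped = true;
                fatalHandler.accept(t);
            }
        }
    }

    public static void main(String[] args) {
        FailFastEventProcessor processor = new FailFastEventProcessor(
            event -> {
                if (event.equals("NODE_UPDATE")) {
                    throw new NullPointerException("simulated scheduler bug");
                }
                System.out.println("handled " + event);
            },
            t -> System.err.println("fatal error in event processor, shutting down: " + t));
        processor.enqueue("APP_ADDED");
        processor.enqueue("NODE_UPDATE");
        processor.run(); // run inline for the demo; a daemon would use a dedicated thread
    }
}
```

Passing the fatal handler in as a callback keeps the shutdown policy in one place, so the same loop could be reused by other components that need the same check.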
                
> Sometimes NODE_UPDATE to the scheduler throws an NPE causing the scheduling to stop
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3530
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3530
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, resourcemanager, scheduler
>    Affects Versions: 0.23.1
>            Reporter: Karam Singh
>            Assignee: Arun C Murthy
>            Priority: Blocker
>         Attachments: MAPREDUCE-3530.patch
>
>
> Sometimes a NODE_UPDATE to the scheduler throws an NPE, which causes scheduling to stop while the ResourceManager keeps running.
> I have been observing this intermittently for the last 3 weeks.
> With the latest svn code, I tried to run sort twice, and both times the job got stuck due to an NPE.
> {code}
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp.containerLaunchedOnNode(SchedulerApp.java:181)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.containerLaunchedOnNode(CapacityScheduler.java:596)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:539)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:617)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:77)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:294)
>         at java.lang.Thread.run(Thread.java:619)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
