hadoop-mapreduce-issues mailing list archives

From "Ahmed Radwan (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-2788) LeafQueue.assignContainer() can cause a crash if request.getCapability().getMemory() == 0
Date Fri, 14 Oct 2011 19:04:11 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Ahmed Radwan updated MAPREDUCE-2788:

    Attachment: MAPREDUCE-2788_rev3.patch

Thanks, Arun, for your comments. I have looked into normalizing the requests within the CapacityScheduler.

It doesn't seem that the call to LeafQueue.assignContainer(..) comes via CapacityScheduler.allocate().
Instead, it is reached through the call path:

LeafQueue.assignContainer(..) <- LeafQueue.assignNodeLocalContainers(..) <- LeafQueue.assignContainersOnNode(..)
<- LeafQueue.assignContainers(..)

There are alternative paths, but all lead to the same source.

The SchedulerApp application (in the LeafQueue.assignContainers(..) call) is one of the Map<ApplicationAttemptId,
SchedulerApp> applicationsMap values. This applicationsMap is only populated through LeafQueue.addApplication(..).

LeafQueue.addApplication(..) is in turn called through the path: LeafQueue.addApplication(..)
<- LeafQueue.submitApplication(..) <- CapacityScheduler.addApplication(..).

So I have added code to CapacityScheduler.addApplication(..) to normalize all resource requests
for the SchedulerApp before submitting to the queue.
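
As a rough illustration of what normalizing a request could look like, here is a minimal,
self-contained sketch. It is not the code from the attached patch: the class MemoryNormalizer,
its methods, and the 1024 MB minimum are all hypothetical, and it assumes the crash stems from
an integer division by the requested memory, as the issue description suggests.

```java
// Hypothetical sketch only; names and values are illustrative and are NOT
// taken from MAPREDUCE-2788_rev3.patch or from the YARN code base.
public final class MemoryNormalizer {
    private MemoryNormalizer() {}

    /**
     * Rounds requestedMb up to the nearest positive multiple of minimumMb,
     * so a request for 0 MB becomes a request for minimumMb.
     */
    public static int normalizeMemory(int requestedMb, int minimumMb) {
        if (minimumMb <= 0) {
            throw new IllegalArgumentException("minimumMb must be positive");
        }
        int clamped = Math.max(requestedMb, minimumMb);
        int multiples = (clamped + minimumMb - 1) / minimumMb; // ceiling division
        return multiples * minimumMb;
    }

    /**
     * Stand-in for a scheduler computation that divides by the requested
     * memory. Integer division throws ArithmeticException if requiredMb == 0.
     */
    public static int containersThatFit(int availableMb, int requiredMb) {
        return availableMb / requiredMb;
    }

    public static void main(String[] args) {
        int minimumMb = 1024; // assumed minimum allocation
        int normalized = normalizeMemory(0, minimumMb);
        // Safe after normalization: the divisor can no longer be zero.
        System.out.println(containersThatFit(8192, normalized));
    }
}
```

Normalizing once, before the application is submitted to the queue, means every later
computation sees a non-zero memory value without needing its own guard.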

If the LeafQueue is inextricably tied to CS, we may need to update the references in LeafQueue
to use CapacityScheduler instead of CapacitySchedulerContext. This would make the dependency
explicit and avoid future confusion. I haven't made this interface change in the attached patch,
as it requires more changes to other components, but if we agree on it, I can do it in
a follow-up issue.
> LeafQueue.assignContainer() can cause a crash if request.getCapability().getMemory() == 0
> -----------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-2788
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2788
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Ahmed Radwan
>            Assignee: Ahmed Radwan
>            Priority: Critical
>         Attachments: MAPREDUCE-2788.patch, MAPREDUCE-2788_rev2.patch, MAPREDUCE-2788_rev3.patch
> The assignContainer() method in org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
> can cause the scheduler to crash if the ResourceRequest capability memory == 0 (divide by
