hadoop-mapreduce-issues mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6541) Exclude pending reducer memory when calculating available mapper slots from headroom to avoid deadlock
Date Fri, 06 Nov 2015 23:54:11 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994732#comment-14994732 ]

Varun Saxena commented on MAPREDUCE-6541:

Assigned it to myself as I am working on two similar JIRAs.
Wangda, kindly reassign if you will be working on it.

Haven't thought through all the cases, but on the face of it this makes sense. Since reducers
have higher priority, the check for whether a mapper has enough headroom needs to account
for the fact that pending reducers will be assigned resources before it, which is what
excluding the pending reducers' resources achieves. If I am not wrong, you are talking about
the condition below in RMContainerAllocator#preemptReducesIfNeeded:
    // The pending mappers haven't been waiting for too long. Let us see if
    // the headroom can fit a mapper.
    Resource availableResourceForMap = getAvailableResources();
    if (ResourceCalculatorUtils.computeAvailableContainers(availableResourceForMap,
        mapResourceRequest, getSchedulerResourceTypes()) > 0) {
      // the available headroom is enough to run a mapper
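
The proposed change can be sketched roughly as follows. This is a minimal sketch under my own assumptions: the method name and the plain-MB arithmetic are hypothetical, whereas the real RMContainerAllocator works on Resource objects through ResourceCalculatorUtils.computeAvailableContainers.

```java
// Hypothetical sketch of the proposed headroom check; names and the
// MB-only arithmetic are assumptions, not the actual MR AM code.
public class HeadroomCheckSketch {

    static int availableMapperSlots(long headroomMb, long pendingReducerMb,
                                    long mapperMb) {
        // Reducer requests have higher priority, so their pending resources
        // will be satisfied first; exclude them from the headroom that the
        // mapper check sees.
        long effectiveHeadroomMb = Math.max(0, headroomMb - pendingReducerMb);
        return (int) (effectiveHeadroomMb / mapperMb);
    }

    public static void main(String[] args) {
        // Numbers from the scenario in this JIRA: headroom = 700 MB, one
        // pending reducer needs 1000 MB, each mapper needs 700 MB.
        // With pending reducers excluded, no mapper slot is reported, so
        // reducer preemption can kick in instead of deadlocking.
        System.out.println(availableMapperSlots(700, 1000, 700));
        // Without any pending reducer, one mapper slot is reported.
        System.out.println(availableMapperSlots(700, 0, 700));
    }
}
```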

Coming to MAPREDUCE-6514: in addition to updating the ask, it will also consider in
scheduleReduces() whether or not to ramp up reduces when maps are hanging. Will update the
JIRA description accordingly.

> Exclude pending reducer memory when calculating available mapper slots from headroom to avoid deadlock
> -------------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-6541
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6541
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Wangda Tan
>            Assignee: Varun Saxena
> We saw a MR deadlock recently:
> - When NMs are restarted by the framework without recovery enabled, containers running on
> these nodes are identified as "ABORTED", and the MR AM tries to reschedule the "ABORTED"
> mapper containers.
> - Since such lost mappers were "ABORTED" containers, the MR AM gives normal mapper priority
> (priority=20) to these mapper requests. If there is any pending reducer request (priority=10)
> at the same time, the mapper requests must wait until the reducer requests are satisfied.
> - In our test, one mapper needs 700+ MB, a reducer needs 1000+ MB, and RM available resource
> = mapper-request = (700+ MB). Only one job was running in the system, so the scheduler cannot
> allocate more reducer containers, AND the MR AM thinks there is enough headroom for a mapper,
> so reducer containers will not be preempted.
> MAPREDUCE-6302 can solve most of the problem, but on the other hand, I think we may need to
> exclude pending reducers' resources when calculating #available-mapper-slots from headroom,
> so that we can avoid excessive reducer preemption.
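
The deadlock arithmetic in the quoted report can be made concrete with the numbers it gives. This is an illustration only; the class and method names are hypothetical and the real scheduler/AM logic is more involved.

```java
// Illustration of the deadlock described in the report: the scheduler cannot
// fit the higher-priority reducer, while the AM's raw headroom check says a
// mapper fits, so no preemption is triggered. Names here are hypothetical.
public class DeadlockScenario {

    static boolean isDeadlocked(long availableMb, long mapperNeedMb,
                                long reducerNeedMb) {
        // Scheduler side: the pending reducer (higher priority) cannot fit,
        // so no container is allocated.
        boolean reducerFits = reducerNeedMb <= availableMb;

        // AM side: the raw headroom looks big enough for a mapper, so the
        // AM does not preempt any running reducer.
        boolean amThinksMapperFits = mapperNeedMb <= availableMb;

        // Deadlock: nothing is allocated and nothing is preempted.
        return !reducerFits && amThinksMapperFits;
    }

    public static void main(String[] args) {
        // Report's numbers: 700 MB available, mapper needs 700 MB,
        // reducer needs 1000 MB.
        System.out.println(isDeadlocked(700, 700, 1000));
    }
}
```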

This message was sent by Atlassian JIRA
