hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
Date Mon, 16 Jun 2014 14:38:01 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032473#comment-14032473 ]

Jason Lowe commented on MAPREDUCE-5928:
---------------------------------------

This sounds like a bug in either headroom calculation or in RMContainerAllocator where the
AM decides whether to preempt reducers.  Could you look in the AM log and see what it saw
for the headroom and whether it made any attempt at all to ramp down reducers?

> Deadlock allocating containers for mappers and reducers
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-5928
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>         Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
>            Reporter: Niels Basjes
>         Attachments: Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg
>
>
> I have a small cluster consisting of 8 desktop-class systems (1 master + 7 workers).
> Due to the small memory of these systems I configured YARN as follows:
> {quote}
> yarn.nodemanager.resource.memory-mb = 2200
> yarn.scheduler.minimum-allocation-mb = 250
> {quote}
> On my client I did
> {quote}
> mapreduce.map.memory.mb = 512
> mapreduce.reduce.memory.mb = 512
> {quote}
> Now I run a job with 27 mappers and 32 reducers.
> After a while I saw this deadlock occur:
> - All nodes had been filled to their maximum capacity with reducers.
> - 1 Mapper was waiting for a container slot to start in.
> I tried killing reducer attempts but that didn't help (new reducer attempts simply took
> the existing container).
> *Workaround*:
> I set this value from my job. The default value is 0.05 (= 5%)
> {quote}
> mapreduce.job.reduce.slowstart.completedmaps = 0.99f
> {quote}
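
For reference, the arithmetic behind the reported setup can be sketched as below. This is a rough illustration, not taken from the AM or scheduler code. It assumes the scheduler rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb, and that the slowstart threshold is rounded up to a whole number of maps. Both are assumptions that depend on the scheduler and version in use. The function names are made up for this sketch, and the AM's own container is ignored.

```python
import math

def normalized_mb(requested_mb, min_alloc_mb):
    # Assumption: requests are rounded up to a multiple of the minimum allocation.
    return math.ceil(requested_mb / min_alloc_mb) * min_alloc_mb

def containers_per_node(node_mb, requested_mb, min_alloc_mb):
    # Whole containers of the normalized size that fit in one NodeManager.
    return node_mb // normalized_mb(requested_mb, min_alloc_mb)

def maps_before_reducers(slowstart, total_maps):
    # Assumption: the fraction is rounded up to a whole number of completed maps.
    return math.ceil(slowstart * total_maps)

# Values from the issue description.
NODE_MB = 2200        # yarn.nodemanager.resource.memory-mb
MIN_ALLOC_MB = 250    # yarn.scheduler.minimum-allocation-mb
TASK_MB = 512         # mapreduce.map.memory.mb / mapreduce.reduce.memory.mb
WORKERS = 7
MAPS, REDUCES = 27, 32

per_node = containers_per_node(NODE_MB, TASK_MB, MIN_ALLOC_MB)
cluster_slots = per_node * WORKERS

print("container size (MB):", normalized_mb(TASK_MB, MIN_ALLOC_MB))
print("containers per node:", per_node)
print("cluster task slots:", cluster_slots)
print("reducers exceed slots:", REDUCES > cluster_slots)
print("maps done before reducers start (0.05):", maps_before_reducers(0.05, MAPS))
print("maps done before reducers start (0.99):", maps_before_reducers(0.99, MAPS))
```

Under these assumptions the cluster offers fewer task slots than the job has reducers, so reducers alone can occupy every slot once they are allowed to start. With the default slowstart of 0.05 they may start after only a couple of maps have finished, while 0.99 holds them back until all 27 maps are done, which matches the reported workaround.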



--
This message was sent by Atlassian JIRA
(v6.2#6252)
