hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Niels Basjes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
Date Mon, 16 Jun 2014 15:36:04 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032564#comment-14032564
] 

Niels Basjes commented on MAPREDUCE-5928:
-----------------------------------------

I took the 'dead' node (node2) offline (completely stopped all hadoop/yarn related deamons)
and ran the same job again after it had disappeared in all overviews.
Now it does complete all mappers.

> Deadlock allocating containers for mappers and reducers
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-5928
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>         Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
>            Reporter: Niels Basjes
>         Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully loaded.png.jpg, MR
job stuck in deadlock.png.jpg
>
>
> I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers).
> Due to the small memory of these systems I configured yarn as follows:
> {quote}
> yarn.nodemanager.resource.memory-mb = 2200
> yarn.scheduler.minimum-allocation-mb = 250
> {quote}
> On my client I did
> {quote}
> mapreduce.map.memory.mb = 512
> mapreduce.reduce.memory.mb = 512
> {quote}
> Now I run a job with 27 mappers and 32 reducers.
> After a while I saw this deadlock occur:
> -	All nodes had been filled to their maximum capacity with reducers.
> -	1 Mapper was waiting for a container slot to start in.
> I tried killing reducer attempts but that didn't help (new reducer attempts simply took
the existing container).
> *Workaround*:
> I set this value from my job. The default value is 0.05 (= 5%)
> {quote}
> mapreduce.job.reduce.slowstart.completedmaps = 0.99f
> {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message