Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: yarn-issues@hadoop.apache.org
Date: Fri, 1 May 2015 21:33:07 +0000 (UTC)
From: "Jian He (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.12691568.1390892439000.51704.1430515987316@Atlassian.JIRA>
In-Reply-To: <JIRA.12691568.1390892439000@Atlassian.JIRA>
References: <JIRA.12691568.1390892439000@Atlassian.JIRA>
 <JIRA.12691568.1390892439803@arcas>
Subject: [jira] [Commented] (YARN-1662) Capacity Scheduler reservation issue
 cause Job Hang
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/YARN-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523998#comment-14523998 ] 

Jian He commented on YARN-1662:
-------------------------------

Hi [~sunilg], YARN-1198 fixed a number of headroom issues so that the headroom is computed correctly and reducer preemption kicks in when it should. Given that, could this problem already be resolved?

> Capacity Scheduler reservation issue cause Job Hang
> ---------------------------------------------------
>
>                 Key: YARN-1662
>                 URL: https://issues.apache.org/jira/browse/YARN-1662
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.2.0
>         Environment: Suse 11 SP1 + Linux
>            Reporter: Sunil G
>
> There are 2 NodeManagers in my cluster:
> NM1 with 8GB
> NM2 with 8GB
> I am submitting a job with the following details:
> AM needs 2GB
> Each map needs 5GB
> Each reducer needs 3GB
> Slowstart is enabled with 0.5
> 10 maps and 50 reducers are assigned.
> 5 maps have completed, and a few reducers have now been scheduled.
> Now NM1 has the 2GB AM and 3GB Reducer_1    [Used 5GB]
> NM2 has 3GB Reducer_2                       [Used 3GB]
> A map has now reserved 5GB on NM1, which has only 3GB free.
> The job hangs forever.
> The potential issue is that the reservation on NM1 is blocked for a map that needs 5GB,
> while Reducer_1 hangs waiting for a few map outputs.
> Reducer-side preemption also did not happen, because some headroom is still available.
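
The arithmetic behind the reported hang can be checked with a small sketch. This is a toy model, not YARN code: the Node class, the approximation of headroom as total free memory across both nodes, and the rule that reducers are preempted only when the headroom cannot hold one map are all simplifying assumptions made for illustration. The real CapacityScheduler headroom calculation also depends on queue and user limits.

```python
# Toy model of the scenario in the report. All names here are hypothetical
# and nothing below is taken from the YARN code base.
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity_gb: int
    used_gb: int

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


MAP_GB = 5  # memory requested by each map task

nm1 = Node("NM1", capacity_gb=8, used_gb=2 + 3)  # AM (2GB) + Reducer_1 (3GB)
nm2 = Node("NM2", capacity_gb=8, used_gb=3)      # Reducer_2 (3GB)

# Assumption: headroom is approximated as the total free memory in the cluster.
headroom_gb = nm1.free_gb + nm2.free_gb

# The reservation sits on NM1, so only NM1's free memory can satisfy it.
reservation_fits = nm1.free_gb >= MAP_GB

# Assumption: reducers are preempted only if the headroom cannot hold one map.
preempt_reducers = headroom_gb < MAP_GB

print(f"NM1 free: {nm1.free_gb}GB, NM2 free: {nm2.free_gb}GB")
print(f"headroom: {headroom_gb}GB, map request: {MAP_GB}GB")
print(f"reservation on NM1 can be satisfied: {reservation_fits}")
print(f"reducer preemption triggered: {preempt_reducers}")
```

Under these assumptions the reported headroom (8GB) exceeds the map request (5GB), so preemption is not triggered, while the reserved node has only 3GB free, so the reservation is never satisfied.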


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)