hadoop-yarn-issues mailing list archives

From "Rohith Sharma K S (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2487) Need to support timeout of AM When no containers are assigned to it for a defined period
Date Fri, 25 Sep 2015 04:09:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907533#comment-14907533 ]

Rohith Sharma K S commented on YARN-2487:
-----------------------------------------

Hi [~Naganarasimha Garla], it is worth keeping the application if it is running. The problem
is that YARN currently does not identify why an application is not progressing, and there
could be several reasons for that. So I feel this could be handled if there were a mechanism
to get the reason an application is not progressing. I believe YARN-4091 is one such issue:
it is trying to gather more debug information and plans to expose a REST interface for
getting per-application progress information.

> Need to support timeout of AM When no containers are assigned to it for a defined period
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-2487
>                 URL: https://issues.apache.org/jira/browse/YARN-2487
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>
> There are some scenarios where an AM will not get containers and waits indefinitely.
> We faced one such scenario, which caused the applications to hang:
> Consider a cluster with 2 NMs, each with 8 GB of memory,
> and 2 applications (MR2) launched in the default queue, where each AM takes 2 GB.
> One AM is placed on each NM. Each AM then requests a container with 7 GB of memory.
> As only 6 GB is available on each NM, both applications hang forever.
> To avoid such scenarios, I would like to propose a generic timeout feature for all
> AMs in YARN, such that if no containers are assigned to an application for a defined
> period, then YARN can time out the application attempt.
> The default can be set to 0, in which case the RM will not time out the app attempt,
> and the user can set their own timeout when submitting the application.
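
The scenario in the quoted description, and the proposed timeout rule, can be modeled with a small sketch. This is a toy model only, not YARN code: the `Node` class and the `can_place` and `should_time_out` functions are hypothetical names introduced for illustration, and the 600 s and 300 s values in the usage lines are made-up numbers. The only behaviour taken from the description is that a request must fit on a single NM and that a timeout of 0 disables the check.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for one NodeManager's memory, in GB."""
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


def can_place(nodes, request_gb):
    """True if at least one single node has enough free memory for the request."""
    return any(n.free_gb >= request_gb for n in nodes)


def should_time_out(waited_s, timeout_s):
    """Proposed rule: a timeout of 0 means the attempt is never timed out."""
    return timeout_s > 0 and waited_s >= timeout_s


# Two NMs with 8 GB each, each already hosting one 2 GB AM (6 GB free per NM).
nodes = [Node(8, used_gb=2), Node(8, used_gb=2)]

print(can_place(nodes, 7))         # a 7 GB container fits on neither NM
print(should_time_out(600, 0))     # default of 0: keep waiting
print(should_time_out(600, 300))   # user-set timeout has been exceeded
```

Although 12 GB is free across the cluster, no single NM can hold a 7 GB container, so both requests stay pending until something like the proposed timeout ends the attempt.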



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
