hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs
Date Wed, 30 Apr 2014 14:29:17 GMT

    [ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985575#comment-13985575 ]

Jason Lowe commented on YARN-2005:

This is particularly helpful on a busy cluster where one node happens to be in a state where
it can't launch containers for some reason but hasn't self-declared an UNHEALTHY state.  In
that scenario the only place with spare capacity is a node that fails every container attempt,
and apps can fail due to the RM not realizing that repeated AM attempts on the same node aren't working.

In that sense a fix for YARN-1073 could help quite a bit, but there could still be scenarios
where a particular app's AMs end up failing on certain nodes but other containers run just fine.

> Blacklisting support for scheduling AMs
> ---------------------------------------
>                 Key: YARN-2005
>                 URL: https://issues.apache.org/jira/browse/YARN-2005
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>
> It would be nice if the RM supported blacklisting a node for an AM launch after the same
> node fails a configurable number of AM attempts.  This would be similar to the blacklisting
> support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on
> the RM side.
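
For illustration only, the bookkeeping described above might look roughly like the sketch below. Nothing here is existing YARN code: the class AmNodeBlacklist, its methods, and the maxAmFailuresPerNode threshold are hypothetical names chosen for this example, and applications and nodes are plain strings rather than RM types. It assumes failures are tracked per application, so a node that is bad for one app's AM is not excluded for other apps.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of the YARN API: counts failed AM attempts
// per (application, node) and excludes a node once a threshold is reached.
class AmNodeBlacklist {
    // Assumed configurable threshold: failed AM attempts allowed on one node.
    private final int maxAmFailuresPerNode;
    // appId -> (node -> number of failed AM attempts on that node)
    private final Map<String, Map<String, Integer>> failures = new HashMap<>();

    AmNodeBlacklist(int maxAmFailuresPerNode) {
        if (maxAmFailuresPerNode < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.maxAmFailuresPerNode = maxAmFailuresPerNode;
    }

    // Record that an AM attempt of the given app failed on the given node.
    void recordAmFailure(String appId, String node) {
        failures.computeIfAbsent(appId, k -> new HashMap<>())
                .merge(node, 1, Integer::sum);
    }

    // True once the node has failed at least the configured number of
    // AM attempts for this app.
    boolean isBlacklisted(String appId, String node) {
        Map<String, Integer> perNode = failures.get(appId);
        return perNode != null
                && perNode.getOrDefault(node, 0) >= maxAmFailuresPerNode;
    }

    // Filter a candidate set down to nodes still eligible for the next
    // AM attempt of this app.
    Set<String> eligibleNodes(String appId, Set<String> candidates) {
        Set<String> eligible = new HashSet<>();
        for (String node : candidates) {
            if (!isBlacklisted(appId, node)) {
                eligible.add(node);
            }
        }
        return eligible;
    }
}
```

With a threshold of 2, a node would still be offered after its first failed AM attempt for an app and would be skipped for that app after the second. A real implementation would also need to decide what happens when every node ends up blacklisted, which this sketch leaves open.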
