hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs
Date Wed, 30 Apr 2014 14:29:17 GMT

    [ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985575#comment-13985575 ]

Jason Lowe commented on YARN-2005:
----------------------------------

This is particularly helpful on a busy cluster where one node happens to be in a state where
it can't launch containers for some reason but hasn't self-declared an UNHEALTHY state.  In
that scenario the only place with spare capacity is a node that fails every container attempt,
and apps can fail due to the RM not realizing that repeated AM attempts on the same node aren't
working.

In that sense a fix for YARN-1073 could help quite a bit, but there could still be scenarios
where a particular app's AMs end up failing on certain nodes but other containers run just
fine.

> Blacklisting support for scheduling AMs
> ---------------------------------------
>
>                 Key: YARN-2005
>                 URL: https://issues.apache.org/jira/browse/YARN-2005
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>
> It would be nice if the RM supported blacklisting a node for an AM launch after the same
> node fails a configurable number of AM attempts.  This would be similar to the blacklisting
> support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on
> the RM side.
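
The proposal quoted above can be illustrated with a minimal sketch. This is not code from YARN or from any patch on this issue: the class name AmNodeBlacklist, its methods, and the maxAmFailuresPerNode threshold are all hypothetical, and a real RM-side change would presumably hook into the scheduler's own node and application tracking. The sketch only shows the counting logic: one tracker per application records failed AM attempts by node and blacklists a node once it reaches the configured number of failures.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical per-application tracker of AM launch failures by node.
 * Illustrative only; none of these names exist in the YARN API.
 */
class AmNodeBlacklist {
    // Assumed configurable threshold: AM failures on one node before it is blacklisted.
    private final int maxAmFailuresPerNode;
    private final Map<String, Integer> failuresByNode = new HashMap<>();
    private final Set<String> blacklisted = new HashSet<>();

    AmNodeBlacklist(int maxAmFailuresPerNode) {
        if (maxAmFailuresPerNode < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.maxAmFailuresPerNode = maxAmFailuresPerNode;
    }

    /** Records a failed AM attempt on the node; returns true if the node is now blacklisted. */
    boolean recordAmFailure(String nodeId) {
        int failures = failuresByNode.merge(nodeId, 1, Integer::sum);
        if (failures >= maxAmFailuresPerNode) {
            blacklisted.add(nodeId);
        }
        return blacklisted.contains(nodeId);
    }

    /** True if this application's AM should no longer be scheduled on the node. */
    boolean isBlacklisted(String nodeId) {
        return blacklisted.contains(nodeId);
    }

    /** Read-only view of the nodes to exclude from the next AM attempt. */
    Set<String> blacklistedNodes() {
        return Collections.unmodifiableSet(blacklisted);
    }
}
```

With a threshold of 2, a first AM failure on a node leaves it schedulable and a second failure on the same node excludes it from later AM attempts for that application, while other nodes are unaffected.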



--
This message was sent by Atlassian JIRA
(v6.2#6252)
