hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs
Date Thu, 29 Jan 2015 14:33:35 GMT

    [ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296936#comment-14296936 ]

Sunil G commented on YARN-2005:
-------------------------------

bq.  I've seen use a different app name each time they submit
Yes, this is a valid point for a production cluster. I expect the user to be the same across
these submissions, so the user can play a bigger role in identifying such problematic apps.
As mentioned in point 2:
* a failed app can be restricted from running on the same node on a reattempt
* if an app has failed for a given user and the same user submits another application, it is
good to schedule that one on a different node as well
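
To make the two points above concrete, here is a rough sketch in plain Java. It is only an illustration under my own assumptions and is not existing YARN code: the class {{AmFailureTracker}}, its methods, and the {{avoidPerUser}} switch are all hypothetical names, and nodes, apps, and users are plain strings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of YARN: remembers the nodes on which AM
// attempts failed, keyed by application id and by submitting user.
class AmFailureTracker {
    private final boolean avoidPerUser; // assumed switch for the per-user rule
    private final Map<String, Set<String>> failedNodesByApp = new HashMap<>();
    private final Map<String, Set<String>> failedNodesByUser = new HashMap<>();

    AmFailureTracker(boolean avoidPerUser) {
        this.avoidPerUser = avoidPerUser;
    }

    // Record that an AM attempt of appId, submitted by user, failed on node.
    void recordAmFailure(String appId, String user, String node) {
        failedNodesByApp.computeIfAbsent(appId, k -> new HashSet<>()).add(node);
        failedNodesByUser.computeIfAbsent(user, k -> new HashSet<>()).add(node);
    }

    // Nodes the next AM launch for this app and user should stay away from.
    Set<String> nodesToAvoid(String appId, String user) {
        Set<String> avoid = new HashSet<>(
            failedNodesByApp.getOrDefault(appId, Collections.emptySet()));
        if (avoidPerUser) {
            // Covers a resubmission by the same user under a different app name.
            avoid.addAll(failedNodesByUser.getOrDefault(user, Collections.emptySet()));
        }
        return avoid;
    }

    boolean shouldAvoid(String appId, String user, String node) {
        return nodesToAvoid(appId, user).contains(node);
    }
}
```

With the per-user rule enabled, a failure of one app on a node also steers the next app from the same user away from that node, while other users are unaffected.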

bq.but the biggest problem we're seeing probably doesn't need anything that fancy to solve
80% of the cases we see.
I agree that a simple logic can be shaped up first so that we can check its feasibility, followed
by an analysis of how much it really helps. After that, it is better to add complexity. Kindly
share your thoughts on this.
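
As a starting point, the simple logic could look roughly like the sketch below, which follows the issue description: a node stops being used for AM launches once it has failed a configurable number of AM attempts. This is only an assumption-laden illustration, not RM code; {{AmNodeBlacklist}} and {{maxAmFailuresPerNode}} are hypothetical names, and there is no expiry or reset of the counters.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not YARN code: counts failed AM attempts per node and
// treats a node as blacklisted for AM launches once a threshold is reached.
class AmNodeBlacklist {
    private final int maxAmFailuresPerNode; // assumed configurable limit
    private final Map<String, Integer> amFailuresByNode = new HashMap<>();

    AmNodeBlacklist(int maxAmFailuresPerNode) {
        if (maxAmFailuresPerNode < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.maxAmFailuresPerNode = maxAmFailuresPerNode;
    }

    // Call once for every AM attempt that failed on the given node.
    void recordAmFailure(String node) {
        amFailuresByNode.merge(node, 1, Integer::sum);
    }

    // True once the node has failed the configured number of AM attempts.
    boolean isBlacklisted(String node) {
        return amFailuresByNode.getOrDefault(node, 0) >= maxAmFailuresPerNode;
    }
}
```

With a threshold of 2, for example, a node is still eligible after its first AM failure and is skipped for AM launches after the second.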

> Blacklisting support for scheduling AMs
> ---------------------------------------
>
>                 Key: YARN-2005
>                 URL: https://issues.apache.org/jira/browse/YARN-2005
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>
> It would be nice if the RM supported blacklisting a node for an AM launch after the same
> node fails a configurable number of AM attempts.  This would be similar to the blacklisting
> support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on
> the RM side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
