hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs
Date Mon, 29 Jun 2015 21:59:06 GMT

    [ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606491#comment-14606491 ]

Jason Lowe commented on YARN-2005:

Thanks for posting a patch, Anubhav!

I haven't done a very in-depth review, but here are a few comments and questions so far:

Why add an interface to the scheduler to get the number of nodes?  Is there a reason we can't
use ClusterMetrics.getNumActiveNMs?

The blacklist view of the cluster is static.  It takes a snapshot of the number of nodes and
doesn't update as nodes are added to or removed from the cluster.  That's problematic if the
number of nodes changes drastically from one attempt to the next.  I'm thinking in particular
about recovery scenarios or something similar where we may create attempts when only a few
(possibly none?) of the nodes have registered.
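
To illustrate the concern above, here is a minimal sketch of a blacklist that asks for the
node count every time it is consulted instead of snapshotting it at creation.  It is not the
code from the patch: the class name, the methods and the "disable threshold" (the assumed
maximum fraction of the cluster that may be blacklisted) are all hypothetical.  In the RM the
supplier could presumably be backed by ClusterMetrics.getNumActiveNMs.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntSupplier;

/** Hypothetical sketch only; not the implementation in the YARN-2005 patch. */
class DynamicAmBlacklist {
    private final Set<String> failedNodes = new HashSet<>();
    // Queried on every call, so the view follows nodes joining or leaving.
    private final IntSupplier activeNodeCount;
    // Assumed knob: largest fraction of the cluster that may be blacklisted.
    private final double disableThreshold;

    DynamicAmBlacklist(IntSupplier activeNodeCount, double disableThreshold) {
        this.activeNodeCount = activeNodeCount;
        this.disableThreshold = disableThreshold;
    }

    /** Remember a node on which an AM attempt failed. */
    void recordAmFailure(String node) {
        failedNodes.add(node);
    }

    /**
     * Returns the nodes to avoid for the next AM attempt, or an empty set
     * when the failed nodes exceed the allowed share of the current cluster
     * (for example while only a few nodes have registered).
     */
    Set<String> currentBlacklist() {
        int nodes = activeNodeCount.getAsInt();
        if (nodes <= 0 || failedNodes.size() > nodes * disableThreshold) {
            return new HashSet<>();
        }
        return new HashSet<>(failedNodes);
    }
}
```

With a supplier that reads the live count, a blacklist created during recovery, when almost
no nodes have registered, starts to apply once the rest of the cluster comes back, rather
than staying tied to the size seen at creation.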

> Blacklisting support for scheduling AMs
> ---------------------------------------
>                 Key: YARN-2005
>                 URL: https://issues.apache.org/jira/browse/YARN-2005
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-2005.001.patch, YARN-2005.002.patch
> It would be nice if the RM supported blacklisting a node for an AM launch after the same
> node fails a configurable number of AM attempts.  This would be similar to the blacklisting
> support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on
> the RM side.

This message was sent by Atlassian JIRA
