hadoop-yarn-issues mailing list archives

From "Craig Welch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2848) (FICA) Applications should maintain an application specific 'cluster' resource to calculate headroom and userlimit
Date Fri, 01 May 2015 18:41:07 GMT

    [ https://issues.apache.org/jira/browse/YARN-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523629#comment-14523629 ]

Craig Welch commented on YARN-2848:

The ResourceUsage functionality added in [YARN-3356], [YARN-3099], and [YARN-3092] is effectively
an implementation of the approach suggested here, and was also used for [YARN-3463].  Given
that, I'm going to close this one.  While it has not yet been used to address the blacklist
issue with headroom [YARN-1680], that should be handled there in any case.

> (FICA) Applications should maintain an application specific 'cluster' resource to calculate headroom and userlimit
> ------------------------------------------------------------------------------------------------------------------
>                 Key: YARN-2848
>                 URL: https://issues.apache.org/jira/browse/YARN-2848
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>            Reporter: Craig Welch
>            Assignee: Craig Welch
> Likely solutions to [YARN-1680] (properly handling node and rack blacklisting with
> cluster-level node additions and removals) will entail managing an application-level
> "slice" of the cluster resource available to the application, for use in accurately
> calculating the application headroom and user limit.  There is an assumption that events
> which impact this resource occur less frequently than the need to calculate headroom,
> userlimit, etc. (a valid assumption, given that those calculations occur on every
> allocation heartbeat).  Given that, the application should (with assistance from
> cluster-level code...) detect changes to the composition of the cluster (node addition,
> removal) and, when those have occurred, calculate an application-specific cluster
> resource by comparing cluster nodes to its own blacklist (both rack and individual node).
> I think it makes sense to include nodelabel considerations in this calculation, as it
> will be efficient to do both at the same time, and the single resource value reflecting
> both constraints could then be used for efficient, frequent headroom and userlimit
> calculations while remaining highly accurate.  The application would need to be made
> aware of nodelabel changes it is interested in (the application or removal, to/from
> nodes, of labels of interest to the application).  For this purpose, the application
> submission's nodelabel expression would be used to determine the nodelabel impact on the
> resource used to calculate userlimit and headroom.  (Cases where the application elected
> to request resources not using the application-level label expression are out of scope
> for this - but for the common usecase of an application which uses a particular
> expression throughout, userlimit and headroom would be accurate.)  This could also
> provide an overall mechanism for handling application-specific resource constraints
> which might be added in the future.
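The recompute-on-change scheme the description outlines could be sketched roughly as below. This is an illustrative sketch only: the class, record, and method names here are hypothetical and do not correspond to YARN's actual scheduler API. It shows the core idea of caching an application-specific resource total and recomputing it, in a single pass over blacklist and label constraints, only when a cluster-composition event has occurred.

```java
import java.util.*;

// Hypothetical sketch of an application-specific 'cluster' resource, as
// suggested in the description. Names are illustrative, not YARN's API.
public class AppClusterResource {
    private final Set<String> blacklistedNodes = new HashSet<>();
    private final Set<String> blacklistedRacks = new HashSet<>();
    private final String labelExpression;   // app submission's nodelabel expression ("" = none)
    private long cachedMemory = -1;         // cached app-specific resource (memory only, for brevity)
    private boolean clusterChanged = true;  // set on node add/remove or label change events

    public AppClusterResource(String labelExpression) {
        this.labelExpression = labelExpression;
    }

    // Simplified node record: (hostname, rack, labels, memory).
    public record Node(String host, String rack, Set<String> labels, long memory) {}

    public void blacklistNode(String node) { blacklistedNodes.add(node); clusterChanged = true; }
    public void blacklistRack(String rack) { blacklistedRacks.add(rack); clusterChanged = true; }
    public void onClusterEvent()           { clusterChanged = true; }

    // Recomputed only when the cluster composition changed; otherwise the
    // cached value serves the frequent headroom/userlimit calculations.
    public long appClusterMemory(Collection<Node> clusterNodes) {
        if (!clusterChanged) {
            return cachedMemory;
        }
        long total = 0;
        for (Node n : clusterNodes) {
            if (blacklistedNodes.contains(n.host())) continue;
            if (blacklistedRacks.contains(n.rack())) continue;
            // The same pass applies the nodelabel constraint, as the
            // description suggests doing both at once.
            if (!labelExpression.isEmpty() && !n.labels().contains(labelExpression)) continue;
            total += n.memory();
        }
        cachedMemory = total;
        clusterChanged = false;
        return total;
    }
}
```

Headroom and userlimit math would then consume the cached value on every allocation heartbeat, paying the full recomputation cost only on the (rarer) cluster-change events.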

This message was sent by Atlassian JIRA
