hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3409) Add constraint node labels
Date Thu, 01 Dec 2016 20:12:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712946#comment-15712946 ]

Wangda Tan commented on YARN-3409:

Thanks all for the discussion, and sorry for the delayed response. I'm still on vacation, but I should
be able to participate more in the discussion starting next week.

[~Naganarasimha], I think a POC will be helpful for understanding the scope, but from my POV
the biggest challenge of this task is designing the API properly and deciding how to make it work
with existing features (like locality/partition) and future features (like YARN-4902 / global
scheduling, etc.).

Here are my overall thoughts:

h3. 1. For the (old) ResourceRequest API, we have a couple of choices:

*Choice a.*
Use nodeLabelExpression to specify both nodePartition and nodeConstraint.
- This has good semantics, since a "constraint" could be considered one special kind of "label".

- We would still have to add a new field for affinity/anti-affinity.
- Some existing implementations assume that "nodeLabelExpression" equals the partition.

*Choice b.*
Add a new field for the constraint expression, and another for affinity/anti-affinity (as suggested
by Kostas). This should have minimal impact on existing features. But after this, "nodeLabelExpression"
becomes a little ambiguous, so we may need to deprecate the existing nodeLabelExpression.

*My preference:*
*Personally I prefer b.*, but it would be better to rename nodeConstraintExpression to placementStrategy
so we have consistent naming and semantics after YARN-4902. In fact, in my POC patch for
YARN-1042 (https://issues.apache.org/jira/secure/attachment/12822186/YARN-1042-global-scheduling.poc.1.patch#file-0),
I use {{placementStrategy}} as the name.
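
To make Choice b more concrete, here is a minimal sketch of how a request could carry the partition and the constraint expression in separate fields. This is not the actual YARN {{ResourceRequest}} class: the class name {{PlacementRequestSketch}}, its accessors, and the use of an empty string for the default partition are assumptions made only for illustration.

```java
import java.util.Optional;

/**
 * Hypothetical sketch of Choice b, NOT the real YARN ResourceRequest.
 * nodeLabelExpression keeps meaning "partition" only, and a separate
 * placementStrategy field carries constraints / affinity.
 */
final class PlacementRequestSketch {
    private final String resourceName;        // e.g. "*" for an ANY request
    private final String nodeLabelExpression; // partition only; null means "not set"
    private final String placementStrategy;   // constraint expression; null means "none"

    PlacementRequestSketch(String resourceName,
                           String nodeLabelExpression,
                           String placementStrategy) {
        this.resourceName = resourceName;
        this.nodeLabelExpression = nodeLabelExpression;
        this.placementStrategy = placementStrategy;
    }

    String getResourceName() {
        return resourceName;
    }

    /** Returns the partition; "" stands for the default partition in this sketch. */
    String getPartition() {
        return nodeLabelExpression == null ? "" : nodeLabelExpression;
    }

    /** Returns the constraint expression if the request has one. */
    Optional<String> getPlacementStrategy() {
        return Optional.ofNullable(placementStrategy);
    }
}
```

With this shape, code that today treats nodeLabelExpression as the partition keeps working unchanged, and only constraint-aware code needs to look at the new field.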

h3. 2. For the CLI / REST API to manage (add/remove) and get constraints, we also have a couple
of choices:

*Choice a.*
Add a set of new APIs / CLI commands to manage node constraints. For example, we could have a REST API
{{POST /node-constraints/add}} to add node constraints.

*Choice b.*
Extend the existing {{NodeLabel}} object to support node constraints. We only need two additional
fields: 1) isNodeConstraint 2) Value (for example, we could have a
constraint named jdk-version whose value could be 6/7/8).

*My preference:*
Personally I prefer b., since we can reuse most of the existing CLI / REST API implementations.
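
As a rough illustration of Choice b here, the two additional fields could look like the sketch below. It is a standalone class, not the real {{NodeLabel}}: the name {{NodeLabelSketch}}, the factory methods, and the {{name=value}} string form are assumptions for illustration only.

```java
/**
 * Hypothetical sketch of a label object extended with the two fields
 * proposed above (isNodeConstraint and value). NOT the real NodeLabel.
 */
final class NodeLabelSketch {
    private final String name;
    private final boolean isNodeConstraint; // false: partition label, true: constraint
    private final String value;             // only meaningful for constraints

    private NodeLabelSketch(String name, boolean isNodeConstraint, String value) {
        this.name = name;
        this.isNodeConstraint = isNodeConstraint;
        this.value = value;
    }

    /** A plain partition label, as labels behave today. */
    static NodeLabelSketch partition(String name) {
        return new NodeLabelSketch(name, false, null);
    }

    /** A constraint label such as jdk-version with value 8. */
    static NodeLabelSketch constraint(String name, String value) {
        if (value == null) {
            throw new IllegalArgumentException("a constraint needs a value");
        }
        return new NodeLabelSketch(name, true, value);
    }

    String getName() { return name; }
    boolean isNodeConstraint() { return isNodeConstraint; }
    String getValue() { return value; }

    @Override
    public String toString() {
        return isNodeConstraint ? name + "=" + value : name;
    }
}
```

Because a partition label is just the case where isNodeConstraint is false, existing add/remove/list paths could pass both kinds of label through the same object.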

There are some other points that still need to be settled, such as whether to support rack constraints and
whether node constraints should be allowed on non-ANY requests. I suggest we discuss them after
we reach consensus on the two points above.

Please share your thoughts (clearly stating which option you prefer, with some explanation,
would be best).

+ [~naganarasimha_gr@apache.org], [~kkaranasos], [~devaraj.k], [~varun_saxena].

> Add constraint node labels
> --------------------------
>                 Key: YARN-3409
>                 URL: https://issues.apache.org/jira/browse/YARN-3409
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, capacityscheduler, client
>            Reporter: Wangda Tan
>            Assignee: Naganarasimha G R
>         Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf
> Specifying only one label for each node (IAW, partitioning a cluster) is a way to determine
how resources of a special set of nodes can be shared by a group of entities (like teams,
departments, etc.). Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACL/priority can apply to a partition (Only market team / market team has priority to
use the partition).
> - Percentage of capacities can apply to a partition (Market team has 40% minimum capacity
and Dev team has 60% minimum capacity of the partition).
> Constraints are orthogonal to partitions; they describe attributes of a node’s
hardware/software just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for resources on nodes that have (glibc.version >= 2.20
&& JDK.version >= 8u20 && x86_64).
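
The example expression at the end of the issue description could be checked against a node's attributes roughly as in the sketch below. Everything here is an assumption for illustration: the attribute keys (including {{cpu.type}}), the hard-coded expression, and writing 8u20 as the dotted version 8.20 so that one numeric comparison covers both versions. No expression parser is implied.

```java
import java.util.Map;

/** Hypothetical matcher for one fixed constraint expression; not YARN code. */
final class ConstraintMatchSketch {

    /**
     * Compares dotted numeric versions component by component, so that
     * "2.3" is lower than "2.20". Non-numeric components are not handled
     * and would throw NumberFormatException.
     */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** True if the node reports the attribute and it is at least the minimum. */
    static boolean atLeast(Map<String, String> attrs, String key, String min) {
        String actual = attrs.get(key);
        return actual != null && compareVersions(actual, min) >= 0;
    }

    /** glibc.version >= 2.20 && JDK.version >= 8u20 (as 8.20) && x86_64 */
    static boolean matches(Map<String, String> attrs) {
        return atLeast(attrs, "glibc.version", "2.20")
            && atLeast(attrs, "JDK.version", "8.20")
            && "x86_64".equals(attrs.get("cpu.type"));
    }
}
```

The component-wise comparison matters: a plain string comparison would rank glibc 2.3 above 2.20.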

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
