hadoop-yarn-issues mailing list archives

From "Daniel Templeton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3409) Add constraint node labels
Date Tue, 23 May 2017 21:57:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16021942#comment-16021942
] 

Daniel Templeton commented on YARN-3409:
----------------------------------------

Sorry for coming late to the conversation.  Last week I had a quick chat with [~Naganarasimha]
offline about the plans, and I wanted to share an alternate perspective.

If you go look at the way HPC job schedulers (like Grid Engine et al) handle this requirement,
it's an extension of resources.  The work that [~vvasudev] has done on resource types opens
up a natural path to add "static" resource types with the characteristics described here.
 The advantage is that the plumbing for resources is already very mature, and extending it
to support static resources would not introduce much in the way of new logic.  The implementation
of constraints then naturally becomes a superset of resource matching for the consumable resources.
 The disadvantage that [~Naganarasimha] pointed out is that users would have to understand
that resources can be static or consumable, which is a higher bar than just asserting that
all resources are consumable. Given that all the major HPC job schedulers have been using
static resources for this purpose successfully for decades, I don't see that being a major
issue.

To add a little more detail, here's what Grid Engine does (that's relevant to us).  (See
http://gridscheduler.sourceforge.net/htmlman/htmlman5/complex.html)
* All resources have a type, e.g. string, double, boolean, etc.
* All resources have an associated relational operator.  For example the memory resource has
>= as a relational operator, meaning that a request for 4GB of memory is treated as >=
4GB of memory.  In general, resources can only be meaningfully compared in one direction.
* All resources are either consumable or static.  Only numeric resources can be consumable.
* Memory and CPU (and a couple others) are provided implicitly by the system.
* It's possible to configure the agents to run scripts periodically to programmatically determine
values for any resources. Consumable resources decrement from that value.
* The scheduler uses the relational operator for all resources to determine whether resource
requests fit a destination queue/host.
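
As a rough illustration of the model in the list above, here is a minimal Python sketch. It is
not Grid Engine's or YARN's actual code. The names ResourceType, Node, fits and allocate are
hypothetical, versions are modeled as tuples purely for convenience, and the operator is applied
as relop(offered, requested), which is my assumption about direction.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ResourceType:
    """A resource definition: a name, a relational operator, and a kind."""
    name: str
    relop: Callable[[object, object], bool]  # called as relop(offered, requested)
    consumable: bool = False                 # False means "static"


class Node:
    """A host that advertises values for a set of resource types (hypothetical)."""

    def __init__(self, types: Dict[str, ResourceType], values: Dict[str, object]):
        for name, value in values.items():
            rtype = types[name]  # KeyError if the resource type is undefined
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if rtype.consumable and not is_number:
                raise ValueError(f"consumable resource {name!r} must be numeric")
        self.types = types
        self.available = dict(values)

    def fits(self, request: Dict[str, object]) -> bool:
        """Apply each resource's own relational operator to the request."""
        for name, wanted in request.items():
            rtype = self.types.get(name)
            if rtype is None or name not in self.available:
                return False
            if not rtype.relop(self.available[name], wanted):
                return False
        return True

    def allocate(self, request: Dict[str, object]) -> bool:
        """Place the request if it fits; only consumables are decremented."""
        if not self.fits(request):
            return False
        for name, wanted in request.items():
            if self.types[name].consumable:
                self.available[name] -= wanted
        return True


# Example definitions: memory is consumable, the other two are static.
TYPES = {
    "memory_gb": ResourceType("memory_gb", operator.ge, consumable=True),
    "glibc_version": ResourceType("glibc_version", operator.ge),
    "arch": ResourceType("arch", operator.eq),
}

node = Node(TYPES, {"memory_gb": 8, "glibc_version": (2, 20), "arch": "x86_64"})
request = {"memory_gb": 4, "glibc_version": (2, 17), "arch": "x86_64"}
```

With these definitions, two calls to node.allocate(request) succeed and a third fails because
memory is exhausted, while the static glibc_version and arch values are never decremented. The
same fits() loop handles both kinds of resource, which is the point of treating constraints as
static resources.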

Putting static resources and consumables in the same boat saves a fair bit of logic duplication
in implementing things like programmatically determined values.

> Add constraint node labels
> --------------------------
>
>                 Key: YARN-3409
>                 URL: https://issues.apache.org/jira/browse/YARN-3409
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, capacityscheduler, client
>            Reporter: Wangda Tan
>            Assignee: Naganarasimha G R
>         Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf, YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (in other words, partitioning a cluster) is a way to determine
how resources of a special set of nodes can be shared by a group of entities (like teams,
departments, etc.). Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priorities can apply to a partition (only market team / market team has priority to
use the partition).
> - Percentages of capacity can apply to a partition (market team has 40% minimum capacity
and dev team has 60% minimum capacity of the partition).
> Constraints are orthogonal to partitions; they describe attributes of a node’s
hardware/software just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for resources that satisfy (glibc.version >= 2.20
&& JDK.version >= 8u20 && x86_64).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

