hadoop-common-commits mailing list archives

From kkarana...@apache.org
Subject [2/2] hadoop git commit: YARN-8113. Update placement constraints doc with application namespaces and inter-app constraints. Contributed by Weiwei Yang.
Date Wed, 02 May 2018 18:52:02 GMT
YARN-8113. Update placement constraints doc with application namespaces and inter-app constraints.
Contributed by Weiwei Yang.

(cherry picked from commit 3b34fca4b5d67a2685852f30bb61e7c408a0e886)


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/62ad9d51
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/62ad9d51
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/62ad9d51

Branch: refs/heads/branch-3.1
Commit: 62ad9d512d70247c11b0db62d9385eb8444cad15
Parents: 6fce887
Author: Konstantinos Karanasos <kkaranasos@apache.org>
Authored: Wed May 2 11:48:35 2018 -0700
Committer: Konstantinos Karanasos <kkaranasos@apache.org>
Committed: Wed May 2 11:51:45 2018 -0700

----------------------------------------------------------------------
 .../site/markdown/PlacementConstraints.md.vm    | 67 +++++++++++++++-----
 1 file changed, 52 insertions(+), 15 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/62ad9d51/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
index cb34c3f..4ac1683 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
@@ -28,7 +28,7 @@ YARN allows applications to specify placement constraints in the form of
data lo
 
 For example, it may be beneficial to co-locate the allocations of a job on the same rack
(*affinity* constraints) to reduce network costs, spread allocations across machines (*anti-affinity*
constraints) to minimize resource interference, or allow up to a specific number of allocations
in a node group (*cardinality* constraints) to strike a balance between the two. Placement
decisions also affect resilience. For example, allocations placed within the same cluster
upgrade domain would go offline simultaneously.
 
-The applications can specify constraints without requiring knowledge of the underlying topology
of the cluster (e.g., one does not need to specify the specific node or rack where their containers
should be placed with constraints) or the other applications deployed. Currently **intra-application**
constraints are supported, but the design that is followed is generic and support for constraints
across applications will soon be added. Moreover, all constraints at the moment are **hard**,
that is, if the constraints for a container cannot be satisfied due to the current cluster
condition or conflicting constraints, the container request will remain pending or get will
get rejected.
+The applications can specify constraints without requiring knowledge of the underlying topology
of the cluster (e.g., one does not need to specify the specific node or rack where their containers
should be placed with constraints) or the other applications deployed. Currently, all constraints
are **hard**, that is, if a constraint for a container cannot be satisfied due to the current
cluster condition or conflicting constraints, the container request will remain pending or
get rejected.
 
 Note that in this document we use the notion of “allocation” to refer to a unit of resources
(e.g., CPU and memory) that gets allocated in a node. In the current implementation of YARN,
an allocation corresponds to a single container. However, in case an application uses an allocation
to spawn more than one container, an allocation could correspond to multiple containers.
 
@@ -65,15 +65,19 @@ $ yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar
share/ha
 where **PlacementSpec** is of the form:
 
 ```
-PlacementSpec => "" | KeyVal;PlacementSpec
-KeyVal        => SourceTag=Constraint
-SourceTag     => String
-Constraint    => NumContainers | NumContainers,"IN",Scope,TargetTag | NumContainers,"NOTIN",Scope,TargetTag
| NumContainers,"CARDINALITY",Scope,TargetTag,MinCard,MaxCard
-NumContainers => int
-Scope         => "NODE" | "RACK"
-TargetTag     => String
-MinCard       => int
-MaxCard       => int
+PlacementSpec         => "" | KeyVal;PlacementSpec
+KeyVal                => SourceTag=ConstraintExpr
+SourceTag             => String
+ConstraintExpr        => NumContainers | NumContainers, Constraint
+Constraint            => SingleConstraint | CompositeConstraint
+SingleConstraint      => "IN",Scope,TargetTag | "NOTIN",Scope,TargetTag | "CARDINALITY",Scope,TargetTag,MinCard,MaxCard
+CompositeConstraint   => AND(ConstraintList) | OR(ConstraintList)
+ConstraintList        => Constraint | Constraint:ConstraintList
+NumContainers         => int
+Scope                 => "NODE" | "RACK"
+TargetTag             => String
+MinCard               => int
+MaxCard               => int
 ```
 
 Note that when the `-placement_spec` argument is specified in the distributed shell command,
the `-num-containers` argument should not be used. In case `-num-containers` argument is used
in conjunction with `-placement_spec`, the former is ignored. This is because in PlacementSpec,
we determine the number of containers per tag, making the `-num-containers` redundant and
possibly conflicting. Moreover, if `-placement_spec` is used, all containers will be requested
with GUARANTEED execution type.
@@ -82,11 +86,18 @@ An example of PlacementSpec is the following:
 ```
 zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3
 ```
-The above encodes two constraints:
+The above encodes three constraints:
 * place 3 containers with tag "zk" (standing for ZooKeeper) with node anti-affinity to each
other, i.e., do not place more than one container per node (notice that in this first constraint,
the SourceTag and the TargetTag of the constraint coincide);
 * place 5 containers with tag "hbase" with affinity to a rack on which containers with tag
"zk" are running (i.e., an "hbase" container should be placed only at a rack where a "zk"
container is running, given that "zk" is the TargetTag of the second constraint);
-* place 7 container with tag "spark" in nodes that have at least one, but no more than three,
containers, with tag "hbase".
+* place 7 containers with tag "spark" in nodes that have at least one, but no more than three,
containers with tag "hbase".
 
+Another example below demonstrates a composite form of constraint:
+```
+zk=5,AND(IN,RACK,hbase:NOTIN,NODE,zk)
+```
+The above constraint uses the conjunction operator `AND` to combine two constraints. The
`AND` constraint is satisfied when both of its child constraints are satisfied. This
PlacementSpec requests that 5 "zk" containers be placed in a rack where at least one "hbase" container
is running, and on a node where no "zk" container is running.
+Similarly, an `OR` operator can be used to define a constraint that is satisfied when at
least one of its child constraints is satisfied.
+Note that in case "zk" and "hbase" are containers belonging to different applications (which
is most probably the case in real use cases), the allocation tags in the PlacementSpec should
include namespaces, as we describe below (see [Allocation tags namespace](#Allocation_tags_namespace)).
 
 
 Defining Placement Constraints
@@ -98,11 +109,37 @@ Allocation tags are string tags that an application can associate with
(groups o
 
 Note that instead of using the `ResourceRequest` object to define allocation tags, we use
the new `SchedulingRequest` object. This has many similarities with the `ResourceRequest`,
but better separates the sizing of the requested allocations (number and size of allocations,
priority, execution type, etc.), and the constraints dictating how these allocations should
be placed (resource name, relaxed locality). Applications can still use `ResourceRequest`
objects, but in order to define allocation tags and constraints, they need to use the `SchedulingRequest`
object. Within a single `AllocateRequest`, an application should use either the `ResourceRequest`
or the `SchedulingRequest` objects, but not both of them.
 
+$H4 Allocation tags namespace
+
+Allocation tags might refer to containers of the same or different applications, and are
used to express intra- or inter-application constraints, respectively.
+We use allocation tag namespaces in order to specify the scope of applications that an allocation
tag can refer to. By coupling an allocation tag with a namespace, we can restrict whether
the tag targets containers that belong to the same application, to a certain group of applications,
or to any application in the cluster.
+
+We currently support the following namespaces:
+
+| Namespace | Syntax | Description |
+|:--------- |:-------|:------------|
+| SELF | `self/${allocationTag}` | The allocation tag refers to containers of the current
application (to which the constraint will be applied). This is the default namespace. |
+| NOT_SELF | `not-self/${allocationTag}` | The allocation tag refers only to containers that
do not belong to the current application. |
+| ALL | `all/${allocationTag}` | The allocation tag refers to containers of any application.
|
+| APP_ID | `app-id/${applicationID}/${allocationTag}` | The allocation tag refers to containers
of the application with the specified application ID. |
+| APP_TAG | `app-tag/application_tag_name/${allocationTag}` | The allocation tag refers to
containers of applications that are tagged with the specified application tag. |
+
+
+To attach an allocation tag namespace `ns` to a target tag `targetTag`, we use the syntax
`ns/targetTag` in the PlacementSpec. The default namespace is `SELF`, which
is used for **intra-app** constraints; the remaining namespaces are used to specify **inter-app**
constraints. When no namespace is specified next to a tag, `SELF` is assumed.
+
+The example constraints used above could be extended with namespaces as follows:
+```
+zk=3,NOTIN,NODE,not-self/zk:hbase=5,IN,RACK,all/zk:spark=7,CARDINALITY,NODE,app-id/appID_0023/hbase,1,3
+```
+The semantics of these constraints are the following:
+* place 3 containers with tag "zk" (standing for ZooKeeper) to nodes that do not have "zk"
containers from other applications running;
+* place 5 containers with tag "hbase" with affinity to a rack on which containers with tag
"zk" (from any application, be it the same or a different one) are running;
+* place 7 containers with tag "spark" in nodes that have at least one, but no more than three,
containers with tag "hbase" belonging to the application with ID `appID_0023`.
+
 $H4 Differences between node labels, node attributes and allocation tags
 
 The difference between allocation tags and node labels or node attributes (YARN-3409), is
that allocation tags are attached to allocations and not to nodes. When an allocation gets
allocated to a node by the scheduler, the set of tags of that allocation are automatically
added to the node for the duration of the allocation. Hence, a node inherits the tags of the
allocations that are currently allocated to the node. Likewise, a rack inherits the tags of
its nodes. Moreover, similar to node labels and unlike node attributes, allocation tags have
no value attached to them. As we show below, our constraints can refer to allocation tags,
as well as node labels and node attributes.
 
-
 $H3 Placement constraints API
 
 Applications can use the public API in the `PlacementConstraints` class to construct placement
constraints. Before describing the methods for building constraints, we describe the methods
of the `PlacementTargets` class that are used to construct the target expressions that will
then be used in constraints:
@@ -110,7 +147,7 @@ Applications can use the public API in the `PlacementConstraints` to construct
p
 | Method | Description |
 |:------ |:----------- |
 | `allocationTag(String... allocationTags)` | Constructs a target expression on an allocation
tag. It is satisfied if there are allocations with one of the given tags. |
-| `allocationTagToIntraApp(String... allocationTags)` | similar to `allocationTag(String...)`,
but targeting only the containers of the application that will use this target (intra-application
constraints). |
+| `allocationTagWithNamespace(String namespace, String... allocationTags)` | Similar to `allocationTag(String...)`,
but allows specifying a namespace for the given allocation tags. |
 | `nodePartition(String... nodePartitions)` | Constructs a target expression on a node partition.
It is satisfied for nodes that belong to one of the `nodePartitions`. |
 | `nodeAttribute(String attributeKey, String... attributeValues)` | Constructs a target expression
on a node attribute. It is satisfied if the specified node attribute has one of the specified
values. |
 
@@ -136,4 +173,4 @@ Applications have to specify the containers for which each constraint
will be en
 
 When using the `placement-processor` handler (see [Enabling placement constraints](#Enabling_placement_constraints)),
this constraint mapping is specified within the `RegisterApplicationMasterRequest`.
 
-When using the `scheduler` handler, the constraints can also be added at each `SchedulingRequest`
object. Each such constraint is valid for the tag of that scheduling request. In case constraints
are specified both at the `RegisterApplicationMasterRequest` and the scheduling requests,
the latter override the former.
+When using the `scheduler` handler, the constraints can also be added at each `SchedulingRequest`
object. Each such constraint is valid for the tag of that scheduling request. In case constraints
are specified both at the `RegisterApplicationMasterRequest` and the scheduling requests,
the latter override the former.
\ No newline at end of file
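
For readers of this commit, the extended PlacementSpec grammar in the patch above (single constraints plus `AND`/`OR` composites) can be illustrated with a small parser. This is only a sketch and not the parser Hadoop ships: the function names `split_top_level`, `parse_constraint`, and `parse_placement_spec` are hypothetical, and it assumes that `:` separates both the top-level entries and the children of a composite, as in the examples in the patch (the grammar itself writes `;` between entries).

```python
from typing import Dict, List, Optional, Tuple


def split_top_level(text: str, sep: str) -> List[str]:
    """Split text on sep, ignoring separators nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def parse_constraint(text: str) -> tuple:
    """Parse a SingleConstraint or a CompositeConstraint into a nested tuple."""
    for op in ("AND", "OR"):
        if text.startswith(op + "(") and text.endswith(")"):
            inner = text[len(op) + 1:-1]
            children = [parse_constraint(c) for c in split_top_level(inner, ":")]
            return (op, children)
    fields = text.split(",")
    kind = fields[0]
    if kind in ("IN", "NOTIN") and len(fields) == 3:
        return (kind, fields[1], fields[2])  # (type, scope, target tag)
    if kind == "CARDINALITY" and len(fields) == 5:
        return (kind, fields[1], fields[2], int(fields[3]), int(fields[4]))
    raise ValueError(f"unrecognized constraint: {text!r}")


def parse_placement_spec(spec: str) -> Dict[str, Tuple[int, Optional[tuple]]]:
    """Map each source tag to (number of containers, constraint or None)."""
    result: Dict[str, Tuple[int, Optional[tuple]]] = {}
    if not spec:
        return result
    for key_val in split_top_level(spec, ":"):
        source_tag, expr = key_val.split("=", 1)
        count, _, rest = expr.partition(",")
        result[source_tag] = (int(count), parse_constraint(rest) if rest else None)
    return result


simple = parse_placement_spec(
    "zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3"
)
composite = parse_placement_spec("zk=5,AND(IN,RACK,hbase:NOTIN,NODE,zk)")
```

Tracking parenthesis depth while splitting is what lets the `:` inside `AND(...)` be treated as a child separator rather than as the start of a new source tag.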

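The allocation tag namespace syntax from the table in the patch can be sketched in the same way. The snippet below is an illustrative model only, assuming the five prefixes listed there (`self`, `not-self`, `all`, `app-id`, `app-tag`) and the rule that a tag without a prefix falls into `SELF`; the `TargetTag` type and `parse_target_tag` function are hypothetical names, not part of the YARN API.

```python
from typing import NamedTuple, Optional


class TargetTag(NamedTuple):
    namespace: str                # SELF, NOT_SELF, ALL, APP_ID or APP_TAG
    qualifier: Optional[str]      # application ID or application tag, if any
    tag: str                      # the allocation tag itself


# Namespaces written as "<prefix>/<allocationTag>".
_SIMPLE = {"self": "SELF", "not-self": "NOT_SELF", "all": "ALL"}
# Namespaces written as "<prefix>/<value>/<allocationTag>".
_WITH_VALUE = {"app-id": "APP_ID", "app-tag": "APP_TAG"}


def parse_target_tag(text: str) -> TargetTag:
    """Split a namespaced target tag; a bare tag defaults to SELF."""
    parts = text.split("/")
    if len(parts) == 1:
        return TargetTag("SELF", None, parts[0])
    prefix = parts[0]
    if prefix in _SIMPLE and len(parts) == 2:
        return TargetTag(_SIMPLE[prefix], None, parts[1])
    if prefix in _WITH_VALUE and len(parts) == 3:
        return TargetTag(_WITH_VALUE[prefix], parts[1], parts[2])
    raise ValueError(f"unrecognized namespace in target tag: {text!r}")


examples = [
    parse_target_tag(t)
    for t in ("zk", "not-self/zk", "all/zk", "app-id/appID_0023/hbase")
]
```

Under these assumptions, a bare `zk` and `self/zk` resolve to the same intra-app target, while the other prefixes mark the constraint as inter-app.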
