hadoop-common-commits mailing list archives

From w...@apache.org
Subject hadoop git commit: YARN-8394. Improve data locality documentation for Capacity Scheduler. Contributed by Weiwei Yang.
Date Wed, 13 Jun 2018 05:56:33 GMT
Repository: hadoop
Updated Branches:
  refs/heads/branch-3.1 f516a7a85 -> 4488ad529


YARN-8394. Improve data locality documentation for Capacity Scheduler. Contributed by Weiwei Yang.


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/4488ad52
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/4488ad52
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/4488ad52

Branch: refs/heads/branch-3.1
Commit: 4488ad5297ca98b3bbaebb1180233208af2bce8b
Parents: f516a7a
Author: Weiwei Yang <wwei@apache.org>
Authored: Wed Jun 13 09:28:05 2018 +0800
Committer: Weiwei Yang <wwei@apache.org>
Committed: Wed Jun 13 13:54:09 2018 +0800

----------------------------------------------------------------------
 .../conf/capacity-scheduler.xml                                 | 2 ++
 .../hadoop-yarn-site/src/site/markdown/CapacityScheduler.md     | 5 +++++
 2 files changed, 7 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/4488ad52/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf/capacity-scheduler.xml
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf/capacity-scheduler.xml b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf/capacity-scheduler.xml
index aca6c7c..62654ca 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf/capacity-scheduler.xml
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf/capacity-scheduler.xml
@@ -149,6 +149,8 @@
       attempts to schedule rack-local containers.
       When setting this parameter, the size of the cluster should be taken into account.
       We use 40 as the default value, which is approximately the number of nodes in one rack.
+      Note that if this value is -1, the locality constraint in the container request
+      is ignored, which disables delay scheduling.
     </description>
   </property>
 

http://git-wip-us.apache.org/repos/asf/hadoop/blob/4488ad52/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md
index f578ca7..9857010 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md
@@ -389,9 +389,14 @@ list of current scheduling edit policies as a comma separated string in `yarn.re
 
   * Data Locality
 
+Capacity Scheduler leverages `Delay Scheduling` to honor task locality constraints. There are 3 levels of locality constraint: node-local, rack-local and off-switch. The scheduler counts the number of missed opportunities when the locality cannot be satisfied, and waits for this count to reach a threshold before relaxing the locality constraint to the next level. The threshold can be configured with the following properties:
+
 | Property | Description |
 |:---- |:---- |
 | `yarn.scheduler.capacity.node-locality-delay` | Number of missed scheduling opportunities after which the CapacityScheduler attempts to schedule rack-local containers. Typically, this should be set to the number of nodes in the cluster. The default is 40, which is approximately the number of nodes in one rack. A positive integer value is expected. |
+| `yarn.scheduler.capacity.rack-locality-additional-delay` | Number of additional missed scheduling opportunities, on top of the node-locality-delay ones, after which the CapacityScheduler attempts to schedule off-switch containers. By default this value is -1; in that case, the number of missed opportunities before assigning off-switch containers is calculated with the formula `L * C / N`, where `L` is the number of locations (nodes or racks) specified in the resource request, `C` is the number of requested containers, and `N` is the size of the cluster. |
+
+Note that this feature should be disabled if YARN is deployed separately from the file system, because locality is then meaningless. To disable it, set `yarn.scheduler.capacity.node-locality-delay` to `-1`; in that case, the locality constraint in the request is ignored.
 
   * Container Allocation per NodeManager Heartbeat
 


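Reader's note (not part of the commit): the relaxation rule documented in the diff above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the CapacityScheduler implementation. The function names, the strict `>` comparison against each threshold, and the use of floating-point division for `L * C / N` are assumptions made for this sketch.

```python
def off_switch_threshold(node_locality_delay, rack_locality_additional_delay,
                         num_locations, num_containers, cluster_size):
    """Missed opportunities to wait before off-switch placement (sketch).

    Hypothetical helper: mirrors the documented rule only.
    Assumes cluster_size > 0.
    """
    if rack_locality_additional_delay < 0:
        # Documented default (-1): use the formula L * C / N.
        return num_locations * num_containers / cluster_size
    # Otherwise the additional delay is counted on top of node-locality-delay.
    return node_locality_delay + rack_locality_additional_delay


def allowed_locality(missed, node_locality_delay, rack_locality_additional_delay,
                     num_locations, num_containers, cluster_size):
    """Return the most relaxed locality level allowed after `missed` misses."""
    if node_locality_delay == -1:
        # Documented: -1 ignores the locality constraint entirely.
        return "off-switch"
    threshold = off_switch_threshold(node_locality_delay,
                                     rack_locality_additional_delay,
                                     num_locations, num_containers,
                                     cluster_size)
    # Assumption: a level is unlocked once the count exceeds its threshold.
    if missed > threshold:
        return "off-switch"
    if missed > node_locality_delay:
        return "rack-local"
    return "node-local"


if __name__ == "__main__":
    # node-locality-delay=40, rack-locality-additional-delay=20,
    # request names 3 locations for 10 containers on a 100-node cluster.
    for missed in (0, 41, 61):
        print(missed, allowed_locality(missed, 40, 20, 3, 10, 100))
```

With the default `rack-locality-additional-delay` of `-1`, the off-switch threshold depends only on the request and the cluster size, so in this sketch a small request on a large cluster can become eligible for off-switch placement before the node-locality delay is reached.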