hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5864) YARN Capacity Scheduler - Queue Priorities
Date Fri, 27 Jan 2017 13:33:24 GMT

    [ https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15842875#comment-15842875
] 

Hudson commented on YARN-5864:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11184 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11184/])
YARN-6123. [YARN-5864] Add a test to make sure queues of orderingPolicy (sunilg: rev 165f07f51a03137d2e73e39ed1cb48385d963f39)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/policy/PriorityUtilizationQueueOrderingPolicy.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerSurgicalPreemption.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java


> YARN Capacity Scheduler - Queue Priorities
> ------------------------------------------
>
>                 Key: YARN-5864
>                 URL: https://issues.apache.org/jira/browse/YARN-5864
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>             Fix For: 2.9.0, 3.0.0-alpha3
>
>         Attachments: YARN-5864.001.patch, YARN-5864.002.patch, YARN-5864.003.patch, YARN-5864.004.patch,
YARN-5864.005.patch, YARN-5864.006.patch, YARN-5864.007.patch, YARN-5864.branch-2.007_2.patch,
YARN-5864.branch-2.007.patch, YARN-5864.branch-2.008.patch, YARN-5864.poc-0.patch, YARN-5864-preemption-performance-report.pdf,
YARN-5864-usage-doc.html, YARN-CapacityScheduler-Queue-Priorities-design-v1.pdf
>
>
> Currently, at every parent-queue level, the Capacity Scheduler uses the relative used-capacities of the child queues to decide which queue gets the next available resource first.
> For example,
> - Q1 & Q2 are child queues under queueA
> - Q1 has 20% configured capacity and 5% used-capacity, and
> - Q2 has 80% configured capacity and 8% used-capacity.
> In this situation, the relative used-capacities are calculated as below
> - Relative used-capacity of Q1 is 5/20 = 0.25
> - Relative used-capacity of Q2 is 8/80 = 0.10
> In the above example, under today’s Capacity Scheduler algorithm, Q2 is selected by the scheduler first to receive the next available resource.
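The relative used-capacity ordering described above can be sketched as follows. This is a minimal illustration only; the helper name relativeUsage is ours, not part of the CapacityScheduler API:

```java
// Minimal sketch of relative used-capacity ordering, assuming percentages
// of the parent queue's capacity. Not actual CapacityScheduler code.
public class RelativeUsageDemo {
    // Relative used-capacity: used / configured (both in percent of the parent).
    static double relativeUsage(double usedPct, double configuredPct) {
        return usedPct / configuredPct;
    }

    public static void main(String[] args) {
        double q1 = relativeUsage(5, 20);  // Q1: 5% used of 20% configured -> 0.25
        double q2 = relativeUsage(8, 80);  // Q2: 8% used of 80% configured -> 0.10
        // The queue with the lowest relative used-capacity is offered
        // the next available resource first.
        String next = q1 < q2 ? "Q1" : "Q2";
        System.out.println(next); // prints Q2
    }
}
```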
> Simply ordering queues according to relative used-capacities sometimes causes problems, because scarce resources can be assigned to less-important apps first.
> # Latency sensitivity: This is a problem for latency-sensitive applications, where waiting until the ‘other’ queue fills up is not acceptable. The scheduling delay is reflected directly in the response times of these applications.
> # Resource fragmentation for large-container apps: Today’s algorithm also causes issues for applications that need very large containers. Existing queues may all be within their resource guarantees, yet the current allocation distribution on each node may be such that an application needing a large container simply cannot fit on any of those nodes.
> Services:
> # The above problem (2) gets worse with long-running applications. With short-running apps, earlier containers eventually finish and make enough space for the apps with large containers. But with long-running services in the cluster, the application needing large containers may never get resources on any node, even though its demands are not yet met.
> # Long-running services are sometimes pickier about placement than normal batch apps. For example, a long-running service in a separate queue (say queue=service) may, during peak hours, want to launch instances on 50% of the cluster nodes, with a large container on each node, say 200G of memory per container.
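The idea the feature enables can be sketched as a comparator that considers queue priority before relative utilization. This is an illustrative sketch only, not the actual PriorityUtilizationQueueOrderingPolicy implementation; all names here are ours:

```java
// Illustrative sketch: order sibling queues by configured priority first
// (higher wins), falling back to relative used-capacity for ties.
// Not the actual PriorityUtilizationQueueOrderingPolicy code.
import java.util.ArrayList;
import java.util.List;

public class PriorityOrderDemo {
    static final class Q {
        final String name;
        final int priority;     // higher value = more important (our convention)
        final double relUsage;  // used-capacity / configured-capacity

        Q(String name, int priority, double relUsage) {
            this.name = name;
            this.priority = priority;
            this.relUsage = relUsage;
        }
    }

    // Negative result means 'a' should be offered resources before 'b'.
    static int compare(Q a, Q b) {
        if (a.priority != b.priority) {
            return Integer.compare(b.priority, a.priority); // higher priority first
        }
        return Double.compare(a.relUsage, b.relUsage);      // less-utilized first
    }

    public static void main(String[] args) {
        List<Q> queues = new ArrayList<>(List.of(
                new Q("batch", 0, 0.10),     // under-utilized but low priority
                new Q("service", 2, 0.25))); // busier but high priority
        queues.sort(PriorityOrderDemo::compare);
        // With priorities, 'service' wins even though 'batch' is less utilized.
        System.out.println(queues.get(0).name); // prints service
    }
}
```

Under the utilization-only ordering described earlier, the batch queue would win; with a priority-aware comparator, a latency-sensitive service queue can be scheduled first.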



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

