hadoop-yarn-issues mailing list archives

From "Tao Jie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6101) Delay scheduling for node resource balance
Date Wed, 22 Feb 2017 09:20:44 GMT

    [ https://issues.apache.org/jira/browse/YARN-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877833#comment-15877833 ]

Tao Jie commented on YARN-6101:

[~He Tianyi], thank you for sharing your case.
Today, scheduling is triggered by NM heartbeat: once an NM heartbeats in, the scheduler selects
containers to assign to that NM. This makes it difficult to find the globally best node on which
to run a container for an application. It seems that YARN-5139 improves the scheduling logic:
first we find a set of candidate nodes for each resource request, then a NodeScorer measures
which node is the best one to allocate on. In that case, node utilization should be taken into account.
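To make the idea concrete, here is a rough sketch of what utilization-aware node scoring could look like. All names here (Node, scoreNode, bestNode) are illustrative only and are not the actual YARN-5139 API; the scoring rule (prefer the node whose post-allocation CPU/memory utilization is most balanced) is one possible choice, not the one YARN-5139 prescribes.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative sketch, not actual YARN code: given a set of candidate nodes
// for a resource request, score each node and pick the best one.
public class NodeScorerSketch {
    static class Node {
        final int usedCores, totalCores;
        final long usedMemMb, totalMemMb;
        Node(int usedCores, int totalCores, long usedMemMb, long totalMemMb) {
            this.usedCores = usedCores;
            this.totalCores = totalCores;
            this.usedMemMb = usedMemMb;
            this.totalMemMb = totalMemMb;
        }
    }

    // Lower score is better: the absolute gap between CPU and memory
    // utilization after placing the request, so balanced nodes win.
    static double scoreNode(Node n, int reqCores, long reqMemMb) {
        double cpu = (double) (n.usedCores + reqCores) / n.totalCores;
        double mem = (double) (n.usedMemMb + reqMemMb) / n.totalMemMb;
        return Math.abs(cpu - mem);
    }

    static Node bestNode(List<Node> candidates, int reqCores, long reqMemMb) {
        return candidates.stream()
                .min(Comparator.comparingDouble(n -> scoreNode(n, reqCores, reqMemMb)))
                .orElse(null);
    }
}
```

With this rule, a (1 core, 2 GB) request would avoid a node already at (20 cores, 189 GB) of a (48 cores, 192 GB) capacity in favor of a less memory-loaded node, since the former's CPU/memory gap after allocation is much larger.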

> Delay scheduling for node resource balance
> ------------------------------------------
>                 Key: YARN-6101
>                 URL: https://issues.apache.org/jira/browse/YARN-6101
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>            Reporter: He Tianyi
>            Priority: Minor
>         Attachments: YARN-6101.preliminary.0000.patch
> We observed that, in today's cluster, usage of Spark has dramatically increased. 
> This introduced a new issue: CPU/MEM utilization on a single node may become imbalanced,
since Spark is generally more memory intensive. For example, a node with capability
(48 cores, 192 GB memory) cannot satisfy a (1 core, 2 GB memory) request if its currently used
resource is (20 cores, 191 GB memory), even though plenty of resource is available across the whole cluster.
> A thought for avoiding this situation is to introduce some strategy during scheduling.
> This JIRA proposes a delay-scheduling-like approach to achieve better balance between
the different types of resources on each node.
> The basic idea is to consider the dominant resource of each node: when a scheduling opportunity
on a particular node is offered to a resource request, only allocate if the allocation would
change the dominant resource of the node, or, in the worst case, allocate anyway once the number of
offered scheduling opportunities exceeds a certain threshold.
> With YARN SLS and a simulation file with a hybrid workload (MR + Spark), the approach improved
cluster resource usage by nearly 5%. After deployment to production, we observed an 8% improvement.
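The heuristic described in the quoted proposal can be sketched roughly as follows. This is a minimal illustration under assumptions, not the attached patch: the class and method names (DominantResourceDelay, shouldAllocate) and the skip threshold of 3 are hypothetical, and "dominant resource" is taken as whichever of CPU or memory has the higher utilization ratio on the node.

```java
// Hypothetical sketch of the delay-scheduling-like heuristic: accept a
// scheduling opportunity only if the allocation would flip the node's
// dominant resource (rebalancing it), or if the request has already been
// skipped too many times (to avoid starvation).
public class DominantResourceDelay {
    static final int MAX_SKIPPED_OPPORTUNITIES = 3; // assumed threshold

    // The dominant resource is the one with the higher utilization ratio.
    static String dominantResource(int usedCores, int totalCores,
                                   long usedMemMb, long totalMemMb) {
        double cpuRatio = (double) usedCores / totalCores;
        double memRatio = (double) usedMemMb / totalMemMb;
        return cpuRatio >= memRatio ? "cpu" : "memory";
    }

    static boolean shouldAllocate(int usedCores, int totalCores,
                                  long usedMemMb, long totalMemMb,
                                  int reqCores, long reqMemMb,
                                  int skippedOpportunities) {
        if (skippedOpportunities >= MAX_SKIPPED_OPPORTUNITIES) {
            return true; // worst case: allocate at once after enough skips
        }
        String before = dominantResource(usedCores, totalCores,
                                         usedMemMb, totalMemMb);
        String after = dominantResource(usedCores + reqCores, totalCores,
                                        usedMemMb + reqMemMb, totalMemMb);
        // Allocate only when the allocation changes the dominant resource.
        return !after.equals(before);
    }
}
```

For instance, on a memory-dominant node at (20 cores, 150 GB) of (48 cores, 192 GB), a memory-leaning (4 cores, 2 GB) request would be skipped, while a CPU-heavy (20 cores, 4 GB) request would be accepted because it flips the node's dominant resource to CPU.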

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
