hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5139) [Umbrella] Move YARN scheduler towards global scheduler
Date Tue, 24 May 2016 22:45:14 GMT

    [ https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299084#comment-15299084
] 

Arun Suresh commented on YARN-5139:
-----------------------------------

Thanks for starting this, [~leftnoteasy]! I would love to help out and contribute.

bq. Because only application knows which node is best for its pending resource requests, so
we can sort and filter node candidates based on application's resource-requests.
I agree in general, but I think *relational* constraints should maybe be expressed via an API
decoupled from the application's ResourceRequest. By relational constraints I mean constraints
that govern placement / grouping of containers belonging to multiple apps, e.g. containers of
an HBase app and a Storm app to be grouped together, where *together* can be a relaxed constraint
like: within the same node, rack or ANY. A separate API would also help solve problems where two
apps give contradicting constraints, e.g. the RR from the HBase app's allocate() call says
"allocate container with affinity to containers from Storm App" and the RR from the Storm app
says "allocate container with ANTI-affinity to HBase containers".
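
To make the contradiction case concrete, here is a minimal sketch of relational constraints kept in their own registry rather than inside each app's ResourceRequest. Everything in it ({{RelationalConstraint}}, {{Kind}}, {{find_conflicts}}, the scope strings) is a hypothetical name invented for illustration and not an existing YARN API. It also simplifies by treating two constraints as conflicting only when they cover the same pair of apps at the same scope.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    AFFINITY = "affinity"
    ANTI_AFFINITY = "anti-affinity"


@dataclass(frozen=True)
class RelationalConstraint:
    """Hypothetical: `source` app asks for `kind` relative to `target` app within `scope`."""
    source: str
    target: str
    kind: Kind
    scope: str  # "node", "rack" or "any"


def find_conflicts(constraints):
    """Return pairs of constraints on the same two apps and scope that disagree on kind."""
    seen = {}  # (unordered app pair, scope) -> first constraint registered for it
    conflicts = []
    for c in constraints:
        key = (frozenset((c.source, c.target)), c.scope)
        first = seen.setdefault(key, c)
        if first.kind != c.kind:
            conflicts.append((first, c))
    return conflicts


# The HBase / Storm example from above, registered outside any ResourceRequest.
constraints = [
    RelationalConstraint("hbase", "storm", Kind.AFFINITY, "node"),
    RelationalConstraint("storm", "hbase", Kind.ANTI_AFFINITY, "node"),
]
conflicts = find_conflicts(constraints)
```

Because both constraints live in one place, the contradiction can be detected (and rejected or resolved by policy) before either app's allocate() call is processed.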

But in any case, even if what I stated above were a separate API, I still like your concept
of NodeCandidates: since it is essentially a filter, we should be able to compose the
ResourceRequest's constraints with relational constraints. Also, as [~kasha] mentioned, it
would be nice to have the {{ClusterNodeTracker}} expose an API that takes a {{NodeCandidate}}
and returns a list of nodes.
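
A rough sketch of what I mean by composing filters, assuming a node candidate is just a predicate over nodes. The names here ({{Node}}, {{all_of}}, {{has_label}}, {{anti_affinity_to}}, {{NodeTrackerSketch.get_nodes}}) are made up for illustration; this is not the actual {{ClusterNodeTracker}} interface nor the API in the attached patch.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Node:
    name: str
    rack: str
    labels: FrozenSet[str] = frozenset()
    running_apps: FrozenSet[str] = frozenset()


# Assumption: a node candidate is modelled as a plain predicate over nodes.
NodeFilter = Callable[[Node], bool]


def all_of(*filters: NodeFilter) -> NodeFilter:
    """Compose filters: a node qualifies only if every filter accepts it."""
    return lambda node: all(f(node) for f in filters)


def has_label(label: str) -> NodeFilter:
    """Constraint that could come from an app's ResourceRequest."""
    return lambda node: label in node.labels


def anti_affinity_to(app: str) -> NodeFilter:
    """Relational constraint supplied separately from the ResourceRequest."""
    return lambda node: app not in node.running_apps


class NodeTrackerSketch:
    """Hypothetical stand-in for a tracker that answers filter queries."""

    def __init__(self, nodes: Iterable[Node]):
        self._nodes = list(nodes)

    def get_nodes(self, candidate: NodeFilter) -> List[Node]:
        return [n for n in self._nodes if candidate(n)]


tracker = NodeTrackerSketch([
    Node("n1", "r1", frozenset({"ssd"}), frozenset({"storm"})),
    Node("n2", "r1", frozenset({"ssd"}), frozenset()),
    Node("n3", "r2", frozenset(), frozenset()),
])
candidate = all_of(has_label("ssd"), anti_affinity_to("storm"))
matching = [n.name for n in tracker.get_nodes(candidate)]
```

The point of the predicate form is that the scheduler does not need to know where each constraint came from; it only evaluates the composed filter against the tracked nodes.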

> [Umbrella] Move YARN scheduler towards global scheduler
> -------------------------------------------------------
>
>                 Key: YARN-5139
>                 URL: https://issues.apache.org/jira/browse/YARN-5139
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: wip-1.YARN-5139.patch
>
>
> The existing YARN scheduler is based on node heartbeats. This can lead to sub-optimal decisions
because the scheduler can only look at one node at a time when scheduling resources.
> Pseudo code of existing scheduling logic looks like:
> {code}
> for node in allNodes:
>    Go to parentQueue
>       Go to leafQueue
>         for application in leafQueue.applications:
>            for resource-request in application.resource-requests:
>               try to schedule on node
> {code}
> Considering future complex resource placement requirements, such as node constraints
(give me "a && b || c") or anti-affinity (do not allocate HBase regionservers and Storm
workers on the same host), we may need to consider moving the YARN scheduler towards global scheduling.
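
For contrast with the heartbeat-driven loop quoted above, the following is a minimal sketch of a request-driven ("global") pass in which each request is evaluated against all nodes at once. It is an illustration under simplifying assumptions (memory as the only resource, a "most free memory" placement rule, a hypothetical {{schedule_globally}} function) and not the design proposed in the attached patch.

```python
def schedule_globally(requests, nodes):
    """Place each request on the best node out of ALL nodes.

    requests: list of (app, memory_mb, avoid_apps) tuples
    nodes:    dict of node name -> {"free_mb": int, "apps": set of app names}
    Returns a list of (app, node name) placements; unplaceable requests are skipped.
    """
    placements = []
    for app, memory_mb, avoid_apps in requests:
        # Filter: enough free memory and no app that this request must avoid.
        eligible = [
            name for name, node in nodes.items()
            if node["free_mb"] >= memory_mb and not (node["apps"] & avoid_apps)
        ]
        if not eligible:
            continue
        # Sort: assumed policy, prefer the node with the most free memory.
        best = max(eligible, key=lambda name: nodes[name]["free_mb"])
        nodes[best]["free_mb"] -= memory_mb
        nodes[best]["apps"].add(app)
        placements.append((app, best))
    return placements


nodes = {
    "n1": {"free_mb": 4096, "apps": {"storm"}},
    "n2": {"free_mb": 2048, "apps": set()},
}
requests = [
    ("hbase", 1024, {"storm"}),  # HBase must not share a host with Storm
    ("storm", 1024, {"hbase"}),  # and vice versa
]
placements = schedule_globally(requests, nodes)
```

Here the outer loop runs over requests rather than nodes, so the anti-affinity rule can be checked against every node before a placement is chosen.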



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

