hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4557) Few issues in scheduling with Node Labels
Date Mon, 18 Jan 2016 04:16:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104143#comment-15104143 ]

Wangda Tan commented on YARN-4557:

Hi [~Naganarasimha],

Thanks for the comments, and apologies for the delay.

bq. now may be after 10 NonExclusive nodes HB if container gets assigned for priority 10 then
mNPRSO for req with Priority 20 starts from where it had left off i.e. 6 , should it not be
from 0 ?
It's a valid concern, but I think it's a corner case:
- It only applies when the resources requested at the different priorities are the same.
- The example in your comment (requesting a higher priority while some lower-priority
containers are still pending) is not as frequent as a normal container request.
- The worst case is waiting for one node-locality delay, which is not very bad.
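
To make the scenario in the quoted question concrete, here is a minimal Java sketch of a per-priority missed-opportunity counter. The class and method names are hypothetical and this is not the actual CapacityScheduler code. It assumes that a request may use a non-exclusive node once its counter reaches the cluster node count, and that an allocation resets only the counter of the priority that received the container.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-priority counter; an illustration, not the real YARN class. */
public class MissedOpportunityTracker {
    // priority -> number of non-exclusive node heartbeats the request has passed over
    private final Map<Integer, Integer> missed = new HashMap<>();
    private final int clusterNodes;

    public MissedOpportunityTracker(int clusterNodes) {
        this.clusterNodes = clusterNodes;
    }

    /** Records one passed-over heartbeat for the given priority and returns the new count. */
    public int recordMiss(int priority) {
        return missed.merge(priority, 1, Integer::sum);
    }

    /** Assumed rule: allow a non-exclusive node once the count reaches the cluster size. */
    public boolean mayUseNonExclusiveNode(int priority) {
        return count(priority) >= clusterNodes;
    }

    /** Resets only the counter of the priority that just received a container. */
    public void resetOnAllocation(int priority) {
        missed.remove(priority);
    }

    public int count(int priority) {
        return missed.getOrDefault(priority, 0);
    }

    public static void main(String[] args) {
        MissedOpportunityTracker tracker = new MissedOpportunityTracker(10);
        for (int i = 0; i < 6; i++) {
            tracker.recordMiss(20);   // priority 20 has already missed 6 heartbeats
        }
        for (int i = 0; i < 10; i++) {
            tracker.recordMiss(10);   // priority 10 reaches the cluster size
        }
        tracker.resetOnAllocation(10); // container assigned for priority 10
        System.out.println(tracker.count(10) + " " + tracker.count(20));
    }
}
```

Under these assumptions the counter for priority 20 is still 6 after the priority-10 allocation, which is the carry-over behaviour raised in the question.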

I understand there are some issues in how our existing approach handles locality delay together
with priority; this is why I filed YARN-4189. I would prefer not to add complexity or change the
behavior of the existing delay scheduling mechanism unless it's critical (e.g. YARN-4287).

bq. RegularContainerAllocator.assignContainersOnNode(...) returns PRIORITY_SKIPPED hence is
there a chance for priority inversion ?
To me, if a request cannot be satisfied because of hard restrictions (e.g. partition/hard-locality),
we should give lower priorities a chance *in the existing delay scheduling implementation*.
You can take a look at the YARN-4189 design doc, where I have listed existing cases in which delay
scheduling could cause priority inversion. I don't think these issues can be resolved easily.
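
The skip-on-hard-restriction behaviour described above can be sketched as follows. This is a simplified illustration with hypothetical types (Request, Result, assignOnNode), not the real RegularContainerAllocator; it assumes the request list is already ordered from highest to lowest priority and that a hard restriction is known per request.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of skipping a blocked priority; not the actual YARN allocator. */
public class PrioritySkipSketch {
    public enum Result { ALLOCATED, PRIORITY_SKIPPED }

    /** A pending request: its priority and whether a hard restriction blocks it on this node. */
    public static final class Request {
        public final int priority;
        public final boolean blockedByHardRestriction;

        public Request(int priority, boolean blockedByHardRestriction) {
            this.priority = priority;
            this.blockedByHardRestriction = blockedByHardRestriction;
        }
    }

    /** A request blocked by a partition or hard-locality constraint is skipped, not waited on. */
    public static Result tryAssign(Request request) {
        return request.blockedByHardRestriction ? Result.PRIORITY_SKIPPED : Result.ALLOCATED;
    }

    /**
     * Walks the requests in the given order (assumed highest priority first) and
     * returns the priority that received a container, or -1 if none could be placed.
     */
    public static int assignOnNode(List<Request> requests) {
        for (Request request : requests) {
            if (tryAssign(request) == Result.ALLOCATED) {
                return request.priority;
            }
            // PRIORITY_SKIPPED: fall through and give the next priority a chance.
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Request> requests = Arrays.asList(
            new Request(10, true),    // blocked by a hard restriction on this node
            new Request(20, false));  // can be placed on this node
        System.out.println(assignOnNode(requests));
    }
}
```

In this sketch the second request is served although the first one is still pending, which is the kind of priority inversion the question refers to.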

> Few issues in scheduling with Node Labels
> -----------------------------------------
>                 Key: YARN-4557
>                 URL: https://issues.apache.org/jira/browse/YARN-4557
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>            Priority: Minor
>         Attachments: YARN-4557.v1.001.patch, YARN-4557.v2.001.patch, YARN-4557.v2.002.patch
> * When an app has submitted requests for multiple priorities in the default partition, then if
> one of the priority requests has missed non-partitioned-resource-request opportunities equal to
> the cluster size, a container needs to be allocated. Currently, if the higher priority request
> doesn't satisfy the condition, the whole application is skipped instead of just that priority.
> * When a queue has * as accessibility, the queue ordering was not happening properly.

This message was sent by Atlassian JIRA