hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4557) Few issues in scheduling with Node Labels
Date Wed, 13 Jan 2016 08:22:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095809#comment-15095809 ]

Naganarasimha G R commented on YARN-4557:

Hi [~wangda],
Thanks for patiently answering my queries. I still have a few doubts:
bq. This is as same as we store missed opportunity of delayed scheduling. Different priority
could have different requests.
I am a little confused here: if different priorities have different requests, why treat them
differently when assigning in ignore-partition mode? Consider an example.
In a cluster of *size 10*, assume the app has initially requested
*[Priority 20, #containers 1, mem 8gb, label = default, mNPRSO = 6]*
??*mNPRSO => missedNonPartitionedRequestSchedulingOpportunity*??
Now it additionally requests *[Priority 10, #containers 1, mem 8gb, label = default, mNPRSO = 0]*.
If a container is assigned for priority 10 after, say, 10 NonExclusive node heartbeats, then
mNPRSO for the request with priority 20 resumes from where it left off, i.e. 6. *Should it not
start from 0*?

Consider the reverse case, where the app initially requests
*[Priority 10, #containers 1, mem 8gb, label = default, mNPRSO = 5]*
and additionally requests *[Priority 20, #containers 1, mem 8gb, label = default, mNPRSO = 0]*.
Here priority 10 is assigned after 5 more NonExclusive node heartbeats, and only then does
mNPRSO for *priority 20* start counting.
So I felt this is not correct, and that it would be better either to consider
{{missedNonPartitionedRequestSchedulingOpportunity}} for the app as a whole, or to consider it
individually for each priority and return AllocationState.APP_SKIPPED.
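
To make the two examples above easier to follow, here is a minimal sketch of the counter behaviour. It is only a toy model and not the scheduler code: the class {{MissedOpportunityModel}} and all of its methods are hypothetical names, and it assumes that the numerically lowest priority is considered first and that a request is allocated once its counter reaches the cluster size. It contrasts skipping the whole app (only the first pending priority counts a heartbeat) with skipping just the priority (every priority looked at counts it).

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy model of the per-priority mNPRSO counters discussed in this comment.
 * Hypothetical: NOT the CapacityScheduler code; all names are invented.
 */
class MissedOpportunityModel {
    private final int clusterSize;
    // priority -> missed non-partitioned scheduling opportunities.
    // Assumption: the numerically lowest priority is considered first.
    private final TreeMap<Integer, Integer> missed = new TreeMap<>();

    MissedOpportunityModel(int clusterSize) {
        this.clusterSize = clusterSize;
    }

    void addRequest(int priority, int initialMissed) {
        missed.put(priority, initialMissed);
    }

    /** Current counter for a pending priority, or -1 if it is not pending. */
    int missedFor(int priority) {
        return missed.getOrDefault(priority, -1);
    }

    /**
     * One NonExclusive node heartbeat where the whole app is skipped:
     * only the first pending priority counts it.
     * Returns the allocated priority, or -1 if nothing was allocated.
     */
    int heartbeatAppSkipped() {
        Map.Entry<Integer, Integer> first = missed.firstEntry();
        if (first == null) {
            return -1;
        }
        int count = first.getValue() + 1;
        if (count >= clusterSize) {
            missed.remove(first.getKey());
            return first.getKey();
        }
        missed.put(first.getKey(), count);
        return -1;
    }

    /**
     * One NonExclusive node heartbeat where only the priority is skipped:
     * each priority counts it in order until one reaches the cluster size.
     * Returns the allocated priority, or -1 if nothing was allocated.
     */
    int heartbeatPrioritySkipped() {
        Integer allocated = null;
        for (Map.Entry<Integer, Integer> e : missed.entrySet()) {
            int count = e.getValue() + 1;
            e.setValue(count);
            if (count >= clusterSize) {
                allocated = e.getKey();
                break;
            }
        }
        if (allocated == null) {
            return -1;
        }
        missed.remove(allocated);
        return allocated;
    }

    public static void main(String[] args) {
        for (boolean perPriority : new boolean[] {false, true}) {
            // First example: cluster size 10, P20 already at 6, P10 new.
            MissedOpportunityModel m = new MissedOpportunityModel(10);
            m.addRequest(20, 6);
            m.addRequest(10, 0);
            int heartbeats = 0;
            int allocated = -1;
            while (allocated == -1) {
                heartbeats++;
                allocated = perPriority
                        ? m.heartbeatPrioritySkipped()
                        : m.heartbeatAppSkipped();
            }
            System.out.println((perPriority ? "priority-skipped" : "app-skipped")
                    + ": priority " + allocated + " after " + heartbeats + " heartbeats");
        }
    }
}
```

In this model, with the first example, skipping the whole app allocates priority 10 on the 10th heartbeat while the counter for priority 20 stays at 6; skipping only the priority lets priority 20 reach the cluster size first, on the 4th heartbeat.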

bq. This cannot happen, see following code it: if (allocation.state == AllocationState.LOCALITY_SKIPPED)
Thanks, I had missed this part of the code. But consider the case where {{ResourceRequest.getRelaxLocality}}
is false: then {{RegularContainerAllocator.assignContainersOnNode(...)}} returns {{PRIORITY_SKIPPED}},
so is there a chance of priority inversion?

bq. This is "cannot use" and non-exclusive delay is "cannot be satisfied currently"
IIUC, you are indicating that for RRs with different priorities but for the same partition,
priority inversion should not happen?

> Few issues in scheduling with Node Labels
> -----------------------------------------
>                 Key: YARN-4557
>                 URL: https://issues.apache.org/jira/browse/YARN-4557
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>            Priority: Minor
>         Attachments: YARN-4557.v1.001.patch, YARN-4557.v2.001.patch, YARN-4557.v2.002.patch
> * When an app has submitted requests for multiple priorities in the default partition, then if one of the priority requests has missed non-partitioned-resource-request opportunities equal to the cluster size, a container needs to be allocated. Currently, if the higher-priority request doesn't satisfy the condition, the whole application is skipped instead of just the priority.
> * When a queue has * as accessibility, the queue ordering was not happening properly.

This message was sent by Atlassian JIRA
