hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4415) Scheduler Web Ui shows max capacity for the queue is 100% but when we submit application doesnt get assigned
Date Mon, 07 Dec 2015 17:56:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045338#comment-15045338 ]

Naganarasimha G R commented on YARN-4415:
-----------------------------------------

Hi [~xinxianyin]
Thanks for the feedback, but your points only partially match what I have given in the
description.
bq. This would cause confusion because the access-labels inherited from parent have 0 max
capacities. If the case is true, i agree that the inherited access-labels has 100 max capacities
by default.
I am a little confused by your description here, but what I am trying to say is: the max capacity
of an accessible node label (xxx is accessible to the queue because * is configured) for a queue
should be 100 and not 0. That is not what currently happens, because the max capacity is
configured neither for the current queue nor for its parent.

bq. But for the two scenarios in the description, i feel the final result is reasonable because
you didnt set the access-labels for the queue and its parent doesn't have the access-labels
also, so the label is not accessable explicitly by the queue. 
I want to correct this: what I have not set is the *capacities*; the accessible node labels for
the queue have been set to {{*}}. So the label is accessible, but in practice its resources
are configured to zero by default. If the label were not accessible, submitting the application
would have thrown an exception, but it didn't.

bq. But the info that the web ui shows is wrong if the above analysis is right. i think the
cause is from follow sentence in {{QueueCapacitiesInfo.java}}
It's not because of this change; it is caused in {{CapacitySchedulerPage.LeafQueueInfoBlock.renderQueueCapacityInfo}}.
When we try to fetch {{lqinfo.getCapacities().getPartitionQueueCapacitiesInfo(label)}}, we
get a PartitionQueueCapacitiesInfo with default values, which sets the default max capacity to
100.
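
The effect described above can be illustrated with a minimal sketch. The class names below are
hypothetical stand-ins, not the real YARN classes, and the default capacity of 0 is an
assumption; only the default max capacity of 100 comes from the comment above. The point is
that a lookup which silently returns a default-valued record makes an unconfigured partition
look the same as one configured with 100% max capacity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-partition capacity record (not the real YARN class).
final class PartitionCapacitySketch {
    final float capacity;
    final float maxCapacity;

    PartitionCapacitySketch(float capacity, float maxCapacity) {
        this.capacity = capacity;
        this.maxCapacity = maxCapacity;
    }

    // Default record: max capacity 100 as described above; capacity 0 is an assumption.
    static PartitionCapacitySketch defaults() {
        return new PartitionCapacitySketch(0f, 100f);
    }
}

// Hypothetical stand-in for the per-queue capacities holder used by the web UI.
final class QueueCapacitiesSketch {
    private final Map<String, PartitionCapacitySketch> byLabel = new HashMap<>();

    void put(String label, PartitionCapacitySketch capacities) {
        byLabel.put(label, capacities);
    }

    // Returns a default-valued record when nothing is stored for the label,
    // so the caller cannot tell "not configured" apart from "configured as 100".
    PartitionCapacitySketch get(String label) {
        PartitionCapacitySketch stored = byLabel.get(label);
        return stored != null ? stored : PartitionCapacitySketch.defaults();
    }
}
```

With nothing stored for label xxx, {{get("xxx").maxCapacity}} in this sketch is 100, even if the
scheduler itself treats the effective max capacity as 0.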

IMO we need to fetch the capacities of a partition for a given queue from its parent if capacities
are not configured for the queue itself. If they are not configured for its parent either, then
from that parent's parent, and so on. If the root itself doesn't have them, then it should be 0 as
capacity and 100 as max capacity.
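
A rough sketch of the fallback proposed above, under stated assumptions: {{QueueNodeSketch}} and
its methods are hypothetical and do not correspond to any existing CapacityScheduler class, and
capacities are held as a plain {capacity, maxCapacity} pair. It only shows the lookup order
(queue, then ancestors, then the 0 / 100 fallback), not how the scheduler would apply the result.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical queue node; not a real CapacityScheduler class.
final class QueueNodeSketch {
    final String name;
    final QueueNodeSketch parent; // null for the root queue

    // label -> {capacity, maxCapacity}; holds only explicitly configured partitions
    private final Map<String, float[]> configured = new HashMap<>();

    QueueNodeSketch(String name, QueueNodeSketch parent) {
        this.name = name;
        this.parent = parent;
    }

    void configure(String label, float capacity, float maxCapacity) {
        configured.put(label, new float[] {capacity, maxCapacity});
    }

    // Walks from this queue towards the root and returns the first explicitly
    // configured entry for the label. If no queue up to and including the root
    // has one, falls back to capacity 0 and max capacity 100.
    float[] resolve(String label) {
        for (QueueNodeSketch q = this; q != null; q = q.parent) {
            float[] found = q.configured.get(label);
            if (found != null) {
                return found.clone();
            }
        }
        return new float[] {0f, 100f};
    }
}
```

For example, with a root queue and a child queue default where neither configures label xxx,
{{resolve("xxx")}} on default yields capacity 0 and max capacity 100; once the root configures
xxx, default picks up the root's values.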

> Scheduler Web Ui shows max capacity for the queue is 100% but when we submit application doesnt get assigned
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4415
>                 URL: https://issues.apache.org/jira/browse/YARN-4415
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler, resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>         Attachments: App info with diagnostics info.png, screenshot-1.png
>
>
> Steps to reproduce the issue :
> Scenario 1:
> # Configure a queue(default) with accessible node labels as *
> # create an exclusive partition *xxx* and map an NM to it
> # ensure no capacities are configured for default for label xxx
> # start an RM app with queue as default and label as xxx
> # application is stuck but scheduler ui shows 100% as max capacity for that queue
> Scenario 2:
> # create a nonexclusive partition *sharedPartition* and map an NM to it
> # ensure no capacities are configured for default queue
> # start an RM app with queue as *default* and label as *sharedPartition*
> # application is stuck but scheduler ui shows 100% as max capacity for that queue for *sharedPartition*
> For both issues the cause is the same: default max capacity and abs max capacity are set to Zero %



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
