hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps
Date Wed, 20 Jan 2016 08:16:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108196#comment-15108196 ]

Wangda Tan edited comment on YARN-4606 at 1/20/16 8:16 AM:
-----------------------------------------------------------

Proposed solution: 
We should consider a user "active" only if at least one of its applications is active. CS will
then use the "#active-user-which-has-at-least-one-active-app" to compute the user-limit.
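
To make the difference concrete, here is a minimal sketch of the two ways of counting. The class, method, and field names are hypothetical and exist only for illustration; this is not the actual ActiveUsersManager or CapacityScheduler code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model for illustration; not the real ActiveUsersManager API.
final class ActiveUserSketch {
    // An application belongs to one user and is either active or pending
    // (for example, held back by max-am-percent).
    static final class App {
        final String user;
        final boolean active;

        App(String user, boolean active) {
            this.user = user;
            this.active = active;
        }
    }

    // Behavior described in the issue: every user with any app is counted.
    static int countUsersWithAnyApp(List<App> apps) {
        Set<String> users = new HashSet<>();
        for (App app : apps) {
            users.add(app.user);
        }
        return users.size();
    }

    // Proposed behavior: count a user only if at least one of its apps is active.
    static int countUsersWithActiveApp(List<App> apps) {
        Set<String> users = new HashSet<>();
        for (App app : apps) {
            if (app.active) {
                users.add(app.user);
            }
        }
        return users.size();
    }

    public static void main(String[] args) {
        // Same setup as in the issue description: app1/app2 active, app3/app4 pending.
        List<App> apps = Arrays.asList(
            new App("user1", true),
            new App("user2", true),
            new App("user3", false),
            new App("user4", false));
        System.out.println(countUsersWithAnyApp(apps));    // prints 4
        System.out.println(countUsersWithActiveApp(apps)); // prints 2
    }
}
```

With the setup from the issue description, the first count is 4 and the second is 2, which is the number the user-limit computation would use under this proposal.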

The computation of max-am-resource-per-user needs to be updated as well. We should use a
#users-which-has-pending-apps count to compute max-am-resource-per-user.

This looks like a major behavior change to existing scheduler logic. Thoughts? [~vinodkv]/[~jlowe]/[~jianhe].

I'm not sure whether FairScheduler needs similar changes as well. If a user in FSLeafQueue doesn't
have any runnable apps, should we still increase #active-users of QueueMetrics? [~kasha]



> CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4606
>                 URL: https://issues.apache.org/jira/browse/YARN-4606
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Karam Singh
>            Assignee: Wangda Tan
>            Priority: Critical
>
> Currently, if all applications belonging to the same user in a LeafQueue are pending (caused by
> max-am-percent, etc.), ActiveUsersManager still considers that user an active user. This
> could lead to starvation of active applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 (belongs to user3) and
>   app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources,
>   so the computed user-limit-resource could be lower than expected.
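
The effect in the example above can be worked through with a deliberately simplified model that splits a queue's resource evenly among the counted users. The even split, the queue size of 100 units, and all names below are assumptions for illustration; the real CapacityScheduler user-limit computation takes more inputs than this.

```java
// Simplified model for illustration only; not the real CapacityScheduler formula.
final class UserLimitSketch {
    // Split a queue's resource (arbitrary units) evenly among the counted users.
    static int evenSplitUserLimit(int queueResource, int countedUsers) {
        if (countedUsers <= 0) {
            throw new IllegalArgumentException("countedUsers must be positive");
        }
        return queueResource / countedUsers;
    }

    public static void main(String[] args) {
        int queueResource = 100; // hypothetical queue size
        // Counting all four users, including the two with only pending apps.
        System.out.println(evenSplitUserLimit(queueResource, 4)); // prints 25
        // Counting only user1 and user2, who can actually allocate resources.
        System.out.println(evenSplitUserLimit(queueResource, 2)); // prints 50
    }
}
```

Under this model, user1 and user2 are each capped at 25 units instead of 50, while the share attributed to user3 and user4 cannot be used by anyone.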



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
