hadoop-yarn-issues mailing list archives

From "Manikandan R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps
Date Sat, 03 Mar 2018 12:22:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384645#comment-16384645 ]

Manikandan R commented on YARN-4606:
------------------------------------

[~eepayne] [~sunilg] Thanks for your inputs. Sorry for the delay.

I have attached a POC patch to confirm that it is in line with our discussions. Please review
the approach. After the feedback, I will make it a robust patch by adding tests etc., and
will also cover FS and FIFO.

Approach:

1. Introduce activeUsersOfPendingApps in the users manager and increment this count as and
when apps are accepted.
2. After activating the application, increment activeUsers and decrement activeUsersOfPendingApps
in {{UsersManager#activateApplication}}, called from {{AppSchedulingInfo#updatePendingResources}},
only when the app is no longer waiting for its AM container.
3. To calculate the max AM limit per user, use activeUsers + activeUsersOfPendingApps.
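
For illustration only, the three steps above could be sketched roughly as below. This is a standalone sketch, not the attached patch: the class {{UserCountSketch}} and all of its method names are hypothetical and do not mirror the real {{UsersManager}} API. It also assumes that a user who has both running and pending apps is counted only once (as active), which the steps above do not spell out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the bookkeeping in steps 1-3; not the real UsersManager. */
public class UserCountSketch {
    // user -> number of accepted apps still waiting for their AM container
    private final Map<String, Integer> pendingApps = new HashMap<>();
    // user -> number of apps that are no longer waiting for their AM container
    private final Map<String, Integer> activeApps = new HashMap<>();

    /** Step 1: an app was accepted, so count it as pending for its user. */
    public void acceptApplication(String user) {
        pendingApps.merge(user, 1, Integer::sum);
    }

    /** Step 2: the app is no longer waiting for its AM; move it from pending to active. */
    public void activateApplication(String user) {
        Integer pending = pendingApps.get(user);
        if (pending == null) {
            throw new IllegalStateException("no pending app for user " + user);
        }
        if (pending == 1) {
            pendingApps.remove(user);
        } else {
            pendingApps.put(user, pending - 1);
        }
        activeApps.merge(user, 1, Integer::sum);
    }

    /** Users with at least one activated app. */
    public int getActiveUsers() {
        return activeApps.size();
    }

    /** Assumption: users whose apps are all still pending (no activated app). */
    public int getActiveUsersOfPendingApps() {
        int count = 0;
        for (String user : pendingApps.keySet()) {
            if (!activeApps.containsKey(user)) {
                count++;
            }
        }
        return count;
    }

    /** Step 3: the user count to use when computing the max AM limit per user. */
    public int getUsersForAmLimit() {
        return getActiveUsers() + getActiveUsersOfPendingApps();
    }
}
```

With four users each submitting one app and only two of them activated, this gives activeUsers = 2 and activeUsersOfPendingApps = 2, so the AM limit still sees 4 users while the user-limit computation can use 2.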

> CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4606
>                 URL: https://issues.apache.org/jira/browse/YARN-4606
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Karam Singh
>            Assignee: Wangda Tan
>            Priority: Critical
>         Attachments: YARN-4606.1.poc.patch
>
>
> Currently, if all applications belonging to the same user in a LeafQueue are pending (caused by
> max-am-percent, etc.), ActiveUsersManager still considers the user to be an active user. This
> could lead to starvation of active applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 (belongs to user3) and
> app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources.
> So the computed user-limit-resource could be lower than expected.
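
To make the example in the description concrete, here is a deliberately simplified calculation. It assumes the per-user limit is just the queue resource divided evenly by the active-user count; the real CapacityScheduler computation also involves settings such as minimum-user-limit-percent and user-limit-factor, which are ignored here. The class name, method name, and the 100 GB queue size are made up for illustration.

```java
/** Simplified illustration of how an inflated active-user count lowers the per-user limit. */
public class UserLimitExample {
    /** Assumption: an even split of the queue resource across the active users. */
    static long perUserLimitMb(long queueResourceMb, int activeUsers) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        return queueResourceMb / activeUsers;
    }

    public static void main(String[] args) {
        long queueMb = 102400; // hypothetical 100 GB queue

        // Pending users counted as active: user1..user4
        System.out.println(perUserLimitMb(queueMb, 4)); // 25600

        // Only users that can actually allocate: user1, user2
        System.out.println(perUserLimitMb(queueMb, 2)); // 51200
    }
}
```

Under these assumptions, user1 and user2 are each held to 25600 MB instead of 51200 MB, so half of the queue stays unused while user3 and user4 cannot start.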



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

