hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler
Date Mon, 23 Jan 2017 10:32:26 GMT

    [ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834233#comment-15834233 ]

Sunil G commented on YARN-5889:

Hi [~eepayne]
Thank you for the detailed comments.

bq. do we need the isAnActiveUser checks in assignContainer and releaseContainer?
bq. I removed these checks in my local build and the application is able to use all of the queue and cluster.
If we remove the active user check, then {{activeUsersManager.getTotalResUsedByActiveUsers}}
will cover all users, and hence it behaves like the old code. But I agree that the current
computation is not quite correct. For example, *user1* was initially active, and whenever a
container was allocated for *user1*, we added its resource to {{AUM#TotalResUsedByActiveUsers}}.
Now this user has become inactive since it does not have any more outstanding resource requests.
So *user1*'s resources have to be removed from {{AUM#TotalResUsedByActiveUsers}} at that time.
This is not happening now. Even if I fix this, there are some changes in behavior. I can explain.
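
The bookkeeping described above can be sketched roughly as follows. This is only an illustration, not the code in the patch: the class {{ActiveUsageSketch}} and all of its method names are hypothetical, and resources are reduced to a single {{long}} for simplicity. The point it shows is that a user's usage is subtracted from the active total at the moment the user becomes inactive.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: tracks the total resource used by active users only.
// None of these names come from the YARN code base.
final class ActiveUsageSketch {
    private final Map<String, Long> usedByUser = new HashMap<>();
    private final Set<String> activeUsers = new HashSet<>();
    private long totalUsedByActiveUsers;

    // A user becomes active (has outstanding requests): count its current usage.
    void activate(String user) {
        if (activeUsers.add(user)) {
            totalUsedByActiveUsers += usedByUser.getOrDefault(user, 0L);
        }
    }

    // A user becomes inactive: remove its usage from the active total.
    // This is the step described as missing in the comment above.
    void deactivate(String user) {
        if (activeUsers.remove(user)) {
            totalUsedByActiveUsers -= usedByUser.getOrDefault(user, 0L);
        }
    }

    // Container allocated: always update per-user usage, and the active
    // total only when the user is currently active.
    void allocate(String user, long amount) {
        usedByUser.merge(user, amount, Long::sum);
        if (activeUsers.contains(user)) {
            totalUsedByActiveUsers += amount;
        }
    }

    // Container released: mirror image of allocate.
    void release(String user, long amount) {
        usedByUser.merge(user, -amount, Long::sum);
        if (activeUsers.contains(user)) {
            totalUsedByActiveUsers -= amount;
        }
    }

    long totalUsedByActiveUsers() {
        return totalUsedByActiveUsers;
    }
}
```

With this shape, the active-user check inside allocate and release stays consistent with the total, because the adjustment on activation and deactivation keeps the sum equal to the usage of exactly the currently active users.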

    // User limit resource is determined by:
    // max(resourceUsedForActiveUsers / #activeUsers,
    //     queueCapacity * user-limit-percentage%)

Now let's assume two cases: (1) usedResource < queueCap and (2) usedResource > queueCap.

1. {{resourceUsedForActiveUsers / #activeUsers}} will now be a much lower value, since we consider
only the capacity used by active users. In the old case, {{total_used/#activeUsers}} would definitely
be higher. So, per the above equation, the user limit (UL) will be {{queueCapacity * userLimit%}} for
a higher MULP (minimum-user-limit-percent, something like 80~99%). Hence UL will be less than
queueCapacity. (If MULP is lower, then UL will also be lower.)
2. If {{usedResource > queueCap}}, then the UL can exceed the queue capacity based on two
factors: either #active_users is low and the active users' resource usage is more than the queue
capacity, OR the usedResource, which is more than queueCap, is multiplied by a higher MULP value.
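
To make the two cases concrete, here is a small sketch of the equation with made-up numbers. It is not the scheduler code: {{UserLimitSketch}} and its parameters are hypothetical, resources are plain {{long}} values in arbitrary units, and the capacity term is assumed to grow to usedResource once usage exceeds queueCap, which is how case 2 above reads.

```java
// Hypothetical sketch of the user-limit equation discussed above.
final class UserLimitSketch {

    // UL = max(usedByActiveUsers / activeUsers, capacityBasis * MULP%)
    // Assumption: capacityBasis is queueCapacity while usage is below it,
    // and the total used resource once usage exceeds queueCapacity.
    static long userLimit(long usedByActiveUsers, int activeUsers,
                          long totalUsed, long queueCapacity, int mulpPercent) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        long perActiveUserShare = usedByActiveUsers / activeUsers;
        long capacityBasis = Math.max(queueCapacity, totalUsed);
        long minimumShare = capacityBasis * mulpPercent / 100;
        return Math.max(perActiveUserShare, minimumShare);
    }

    public static void main(String[] args) {
        // Case 1: usage below queue capacity (100), high MULP: second term wins.
        System.out.println(userLimit(30, 2, 90, 100, 80));   // 80
        // Case 1 with a low MULP: first term wins and UL is lower still.
        System.out.println(userLimit(30, 2, 90, 100, 10));   // 15
        // Case 2, first factor: one active user already above queue capacity.
        System.out.println(userLimit(150, 1, 150, 100, 50)); // 150
        // Case 2, second factor: usage above capacity times a high MULP.
        System.out.println(userLimit(40, 2, 150, 100, 90));  // 135
    }
}
```

In the last two calls the result is above the queue capacity of 100, one through each of the two factors listed in case 2.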

Altogether, the first part of the existing UL equation will matter only if #active-users
is low or MULP is very low in the cluster. I think that is somewhat fine. Please share your thoughts.

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.0001.patch, YARN-5889.0001.suggested.patchnotes, YARN-5889.0002.patch,
YARN-5889.0003.patch, YARN-5889.0004.patch, YARN-5889.0005.patch, YARN-5889.v0.patch, YARN-5889.v1.patch,
> Currently user-limit is computed during every heartbeat allocation cycle with a write
> lock. To improve performance, this ticket focuses on moving the user-limit calculation out
> of the heartbeat allocation flow.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
