hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler
Date Thu, 01 Dec 2016 10:56:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711647#comment-15711647 ]

Sunil G commented on YARN-5889:
-------------------------------

Thanks [~eepayne]

Yes, for the scheduler, 1ms is also small. It was a tradeoff to see the performance gain and
its impact. With the SLS test, I could see a good improvement in allocation speed.

Now, to bridge the gap, there are two cases to address:
- How do we make sure that every allocation gets a correct and accurate user-limit value, given
that the computation happens every 1ms?
- In a lousy cluster, how can we save CPU cycles by avoiding too many unnecessary computations?

Yes, the ideal way is as you suggested.
- Any change in resources (allocation or release of a container, etc.) for a given user could
set a state variable. The computation thread will clear it if its next cycle follows immediately.
- It's not ideal to ask the allocation thread to wait for the computation. So, on seeing this state
variable set, we might need to compute user-limit in the same allocation thread.
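
To make the two bullets above concrete, here is a minimal sketch of the state-variable idea. It is
not taken from the attached patches: the class UserLimitCache, its method names, and the IntSupplier
standing in for the real user-limit calculation are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntSupplier;

// Hypothetical sketch; none of these names come from the YARN code base.
final class UserLimitCache {
    // Set by any allocation/release for the user, cleared by whoever recomputes.
    private final AtomicBoolean dirty = new AtomicBoolean(true);
    // Stand-in for the real (expensive) user-limit calculation.
    private final IntSupplier computeLimit;
    private volatile int cachedLimit;

    UserLimitCache(IntSupplier computeLimit) {
        this.computeLimit = computeLimit;
    }

    // Called whenever a container is allocated to, or released by, the user.
    void markResourceChanged() {
        dirty.set(true);
    }

    // Called by the periodic computation thread (for example every 1ms).
    // Returns true only if a recomputation actually ran, so an idle
    // cluster with no resource changes spends no cycles here.
    boolean refreshIfDirty() {
        if (dirty.compareAndSet(true, false)) {
            cachedLimit = computeLimit.getAsInt();
            return true;
        }
        return false;
    }

    // Called by the allocation (heartbeat) thread. If the state variable is
    // still set, the limit is computed inline instead of waiting.
    // Note: a second reader can still observe the previous value in the short
    // window between the flag being cleared and cachedLimit being written.
    int getUserLimit() {
        refreshIfDirty();
        return cachedLimit;
    }
}
```

In this sketch the compare-and-set ensures that only one thread pays for a given recomputation, whether
that is the periodic thread or the allocation thread that happened to see the flag first.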

I was looking at the second step to see how much impact a slightly stale user-limit can have.
We may over-allocate or we may under-allocate. I think the under-allocate scenario is fine,
as we will allocate more from the next millisecond. However, the over-allocate scenario may be
a worry. Still, we have preemption/opportunistic ways to handle this.
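
As a rough illustration of the two scenarios, the following sketch compares what would be granted
under a stale limit with what would be granted under the current one. It is a deliberately simplified
model (a request is simply capped at limit minus usage), not the capacity scheduler's actual logic,
and the class and method names are hypothetical.

```java
// Hypothetical, simplified model: a request is capped at (limit - used).
final class StaleLimitExample {
    // Resources granted for a request under a given user-limit.
    static int granted(int used, int request, int limit) {
        return Math.max(0, Math.min(request, limit - used));
    }

    // Positive result: over-allocation caused by the stale limit.
    // Negative result: under-allocation, recovered on a later cycle.
    static int allocationError(int used, int request, int staleLimit, int actualLimit) {
        return granted(used, request, staleLimit) - granted(used, request, actualLimit);
    }
}
```

For instance, with 40 units in use and a request for 20, a stale limit of 60 against an actual limit
of 50 grants 20 instead of 10 (over-allocation by 10), while a stale limit of 50 against an actual
limit of 60 grants 10 instead of 20 (under-allocation by 10).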

Ideally, we were looking to keep user-limit computation out of the allocation thread itself. So
after step 1), we can force the user-allocate thread to push for an immediate computation. Still,
there could be some exceptionally rare cases where the user-limit thread is doing a computation on
release/allocate demand while another allocation thread (heartbeat) runs in the same time
frame. If this is fine, I could update my patch to handle this case.
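
One way to picture "push for an immediate computation" is a single worker thread that serves both
the periodic cycle and on-demand requests. This is only a sketch under assumptions: the class
UserLimitRefresher, its methods, and the IntSupplier standing in for the real calculation are
hypothetical and do not reflect the patch.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Hypothetical sketch; names are illustrative, not from the YARN code base.
final class UserLimitRefresher implements AutoCloseable {
    // A single worker thread, so periodic and on-demand computations never overlap.
    private final ScheduledExecutorService worker =
            Executors.newSingleThreadScheduledExecutor();
    private final IntSupplier computeLimit;
    private volatile int cachedLimit;

    UserLimitRefresher(IntSupplier computeLimit, long periodMillis) {
        this.computeLimit = computeLimit;
        this.cachedLimit = computeLimit.getAsInt();
        worker.scheduleWithFixedDelay(
                this::recompute, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    private void recompute() {
        cachedLimit = computeLimit.getAsInt();
    }

    // Called on allocate/release: queue a computation without blocking the caller.
    Future<?> requestImmediateRefresh() {
        return worker.submit((Runnable) this::recompute);
    }

    // Heartbeat threads read the latest published value and never wait.
    // A heartbeat that arrives while a recomputation is in flight sees the
    // previous value, which is the rare case described above.
    int getUserLimit() {
        return cachedLimit;
    }

    @Override
    public void close() {
        worker.shutdownNow();
    }
}
```

The design choice here is that the allocation path tolerates a briefly stale value rather than
taking a lock or waiting for the worker.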

Thoughts?

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch, YARN-5889.v2.patch
>
>
> Currently, user-limit is computed during every heartbeat allocation cycle under a write
lock. To improve performance, this ticket focuses on moving the user-limit calculation out
of the heartbeat allocation flow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


