hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
Date Tue, 14 Apr 2015 23:44:59 GMT

    [ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495162#comment-14495162 ]

Wangda Tan commented on YARN-3434:
----------------------------------

[~tgraves],
I understand the motivation for doing this, but the ResourceLimits class in LeafQueue
was created for exactly this purpose; ideally, all limit information will be stored in ResourceLimits.

I suggest reusing ResourceLimits for this proposal:
- Add a field minimumResourceNeedUnreserve to ResourceLimits so that we don't need to compute
getMinimumResourceNeedUnreserved. The field will be updated when we compute and compare
the user-limit.
- needToUnreserve does not seem necessary; checking minimumResourceNeedUnreserve <= 0 should be
enough.
- shouldContinue does not seem necessary to me either; allocation will return directly when shouldContinue
== false.

With this, we can avoid some method signature changes in LeafQueue, and we don't need to change
ParentQueue.
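
A minimal sketch of the idea above, not the actual patch. It assumes resources can be modeled as plain megabyte counts, and the names ResourceLimitsSketch, UserLimitCheckSketch, and canAssignToUser (with this signature) are hypothetical stand-ins rather than the real LeafQueue API. The point is only that a single numeric field can replace both the separate needToUnreserve flag and the extra computation method:

```java
// Hypothetical stand-in for the limits holder; resources are plain MB counts,
// not Hadoop's Resource type.
final class ResourceLimitsSketch {
    // How much reserved resource must be released for the allocation to fit
    // under the user limit. Zero means nothing has to be unreserved.
    private long minimumResourceNeedUnreserveMb = 0;

    long getMinimumResourceNeedUnreserveMb() {
        return minimumResourceNeedUnreserveMb;
    }

    void setMinimumResourceNeedUnreserveMb(long mb) {
        this.minimumResourceNeedUnreserveMb = mb;
    }
}

final class UserLimitCheckSketch {
    // Assumed, simplified user-limit check: userUsedMb includes reservations.
    // Records the amount to unreserve in 'limits' while comparing against the
    // user limit, so callers need no separate boolean or recomputation.
    static boolean canAssignToUser(long userLimitMb, long userUsedMb,
                                   long userReservedMb, long requiredMb,
                                   ResourceLimitsSketch limits) {
        long overBy = userUsedMb + requiredMb - userLimitMb;
        if (overBy <= 0) {
            // Fits under the limit without touching reservations.
            limits.setMinimumResourceNeedUnreserveMb(0);
            return true;
        }
        if (userReservedMb >= overBy) {
            // Fits only if at least 'overBy' of the reservations is released.
            limits.setMinimumResourceNeedUnreserveMb(overBy);
            return true;
        }
        // Cannot fit even after unreserving; the caller returns directly.
        limits.setMinimumResourceNeedUnreserveMb(0);
        return false;
    }
}
```

With this shape, a caller would treat getMinimumResourceNeedUnreserveMb() <= 0 as "no unreserve needed" and a false return value as "stop allocating", which is why the two extra flags look redundant.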

> Interaction between reservations and userlimit can result in significant ULF violation
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-3434
>                 URL: https://issues.apache.org/jira/browse/YARN-3434
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>         Attachments: YARN-3434.patch
>
>
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 containers,
> 8G each, within about 5 seconds. I think this allowed the logic in assignToUser() to allow
> the userlimit to be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
