hadoop-yarn-issues mailing list archives

From "Thomas Graves (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
Date Thu, 09 Apr 2015 19:25:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488061#comment-14488061 ]

Thomas Graves commented on YARN-3434:

And I have a question about the continuous reservation checking behavior, which may or may not
be related to this issue: right now it will try to unreserve all containers under a user, but
it will actually unreserve at most one container to allocate a new container. Do you think it
is fine to change the logic to be:
When (continousReservation-enabled) && (user.usage + required - min(max-allocation,
user.total-reserved) <= user.limit), assignContainers will continue. This will prevent attempting
an impossible allocation when a user has reserved lots of containers. (As same as queue reservation

I do think the reservation checking and unreserving can be improved.  I basically started
with a very simple thing and figured we could improve on it.  I'm not sure how much that check
would help in practice.  I guess it might help in cases where you have 1 user in the queue, a
second one shows up, and your user limit gets decreased by a lot.  In that case the check may
stop assignContainers from continuing, since it can short-circuit here.  So it would seem to
be OK for that.
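
To make the proposed condition concrete, here is a minimal sketch of the two checks side by side. It is not the CapacityScheduler code: the class and method names are hypothetical, resources are reduced to a single memory value in MB, and user.usage is assumed to already include the user's reserved containers. The "current" check is a simplified reading of the behavior described in the quoted question (all of the user's reservations are treated as reclaimable).

```java
// Hypothetical sketch, not the actual CapacityScheduler implementation.
// All quantities are memory in MB; usage is assumed to include reserved containers.
final class ReservationCheckSketch {

    // Simplified "current" behavior as described in the question:
    // assume every reserved container of the user could be unreserved.
    static boolean canContinueCurrent(long usageMb, long requiredMb,
                                      long totalReservedMb, long limitMb) {
        return usageMb + requiredMb - totalReservedMb <= limitMb;
    }

    // Proposed check: at most one container is unreserved per allocation,
    // so credit back no more than min(max-allocation, user.total-reserved).
    static boolean canContinueProposed(long usageMb, long requiredMb,
                                       long maxAllocationMb, long totalReservedMb,
                                       long limitMb) {
        long reclaimableMb = Math.min(maxAllocationMb, totalReservedMb);
        return usageMb + requiredMb - reclaimableMb <= limitMb;
    }

    public static void main(String[] args) {
        long limitMb = 100L * 1024;       // user limit: 100 GB (made-up value)
        long usageMb = 110L * 1024;       // usage incl. reservations: 110 GB
        long reservedMb = 40L * 1024;     // of which reserved: 40 GB
        long requiredMb = 8L * 1024;      // new container: 8 GB
        long maxAllocationMb = 8L * 1024; // max allocation: 8 GB

        System.out.println("current check continues:  "
                + canContinueCurrent(usageMb, requiredMb, reservedMb, limitMb));
        System.out.println("proposed check continues: "
                + canContinueProposed(usageMb, requiredMb, maxAllocationMb,
                                      reservedMb, limitMb));
    }
}
```

With these made-up numbers the simplified current check lets the allocation attempt continue, while the proposed check short-circuits because unreserving a single container cannot bring the user back under the limit.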

> Interaction between reservations and userlimit can result in significant ULF violation
> --------------------------------------------------------------------------------------
>                 Key: YARN-3434
>                 URL: https://issues.apache.org/jira/browse/YARN-3434
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>         Attachments: YARN-3434.patch
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 containers, 8G each,
> within about 5 seconds. I think this allowed the logic in assignToUser() to allow
> the userlimit to be surpassed.

This message was sent by Atlassian JIRA
