hadoop-yarn-issues mailing list archives

From "Tao Jie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit
Date Fri, 15 Apr 2016 02:16:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242298#comment-15242298

Tao Jie commented on YARN-3126:

I think this issue is quite common; we have run into the same problem.
The root cause is that the max-limit check at assignment time should compare *current usage* +
*resource to assign* against the *max resource limit*. However, when we have resources to assign
to a queue, we only know the *current resource usage* and the *max resource limit*; we do not
know the *resource to assign* until we actually assign it to an appAttempt.
This patch seems to add an additional check (checkQueueResourceLimit) on the *leaf queue* before
assigning to the AppAttempt, but the *parent queue's* resource usage may still exceed its max
resource limit.
Also, we already have *FSQueue.assignContainerPreCheck* for the max resource limit. If we add a
new check, the former one seems unnecessary.
[~kasha], I would like to hear your thoughts.
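A minimal sketch of the kind of pre-assignment check being discussed, using invented names (`QueueNode`, `wouldExceedMax`) rather than the actual FairScheduler classes: the check has to walk from the leaf queue up through every ancestor, because a container that fits within the leaf's limit can still push a parent queue over its maxResources.

```java
// Hypothetical sketch only; these are NOT the real FairScheduler classes.
class QueueNode {
    final String name;
    final QueueNode parent;      // null for the root queue
    long usedMB;                 // current resource usage (memory only, for brevity)
    final long maxMB;            // configured maxResources limit

    QueueNode(String name, QueueNode parent, long usedMB, long maxMB) {
        this.name = name;
        this.parent = parent;
        this.usedMB = usedMB;
        this.maxMB = maxMB;
    }
}

class MaxShareCheck {
    /**
     * Returns true if assigning a container of size requestMB to the leaf
     * queue would push that queue OR any of its ancestors past maxResources.
     * Checking only the leaf is not enough: a parent's limit can still be
     * violated even when the leaf's own limit is respected.
     */
    static boolean wouldExceedMax(QueueNode leaf, long requestMB) {
        for (QueueNode q = leaf; q != null; q = q.parent) {
            if (q.usedMB + requestMB > q.maxMB) {
                return true;
            }
        }
        return false;
    }
}
```

Note that the check includes the pending request (`usedMB + requestMB`), which is exactly the information that is only available once a concrete appAttempt/container has been chosen.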

> FairScheduler: queue's usedResource is always more than the maxResource limit
> -----------------------------------------------------------------------------
>                 Key: YARN-3126
>                 URL: https://issues.apache.org/jira/browse/YARN-3126
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.3.0
>         Environment: hadoop2.3.0. fair scheduler. spark 1.1.0. 
>            Reporter: Xia Hu
>              Labels: BB2015-05-TBR, assignContainer, fairscheduler, resources
>             Fix For: trunk-win
>         Attachments: resourcelimit-02.patch, resourcelimit-test.patch, resourcelimit.patch
> When submitting a Spark application (in both spark-on-yarn-cluster and spark-on-yarn-client
mode), the queue's usedResources assigned by the FairScheduler can always exceed the queue's
maxResources limit.
> From reading the FairScheduler code, I believe this happens because the requested resources
are not checked when assigning a container.
> Here is the detail:
> 1. Choose a queue. In this step, assignContainerPreCheck verifies whether the queue's
usedResource exceeds its max.
> 2. Then choose an app in that queue.
> 3. Then choose a container. Here is the problem: there is no check whether this container
would push the queue's resources over its max limit. If a queue's usedResource is 13G and the
maxResource limit is 16G, a container asking for 4G may still be assigned.
> This problem always shows up with Spark applications, because different applications can
request different container resources.
> By the way, I have already applied the patch from YARN-2083.
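The 13G/16G/4G scenario from the report can be reduced to a before/after comparison of the check itself (illustrative only; the method names below are invented and are not the actual assignContainerPreCheck signature):

```java
// Illustrative sketch of the reported bug; names are invented, not YARN's API.
class QueueLimitDemo {
    /**
     * The pre-check as described in the report: it looks at current usage
     * only, so a queue at 13G of a 16G limit still passes, and a 4G
     * container is then assigned, pushing usage to 17G.
     */
    static boolean preCheckWithoutRequest(long usedMB, long maxMB) {
        return usedMB <= maxMB;
    }

    /**
     * The check the reporter argues is needed: include the size of the
     * container about to be assigned, so the 4G request is rejected.
     */
    static boolean preCheckWithRequest(long usedMB, long requestMB, long maxMB) {
        return usedMB + requestMB <= maxMB;
    }
}
```

With used = 13G, max = 16G, and a 4G request, the first check passes while the second correctly fails, which matches the overshoot described above.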

This message was sent by Atlassian JIRA
