hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7408) total capacity could be occupied by a large container request
Date Fri, 27 Oct 2017 14:42:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222472#comment-16222472 ]

Jason Lowe commented on YARN-7408:
----------------------------------

I assume you are using CapacityScheduler?  The answers may change for FairScheduler as I am
not as familiar with how it handles reservations.

Reservations can increase over time as the allocation request remains unsatisfied, but the
amount of space that can be reserved is limited by the user and queue limits.  In other words,
the user can't reserve the whole cluster unless they are allowed to use the whole cluster
normally.
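
To illustrate the point about limits, here is a toy Python model, not the CapacityScheduler code. The names (user_limit_gb, queue_limit_gb, try_reserve) and all the numbers are hypothetical. It only assumes that a user's usage plus reservations may not exceed either the user limit or the queue limit.

```python
def headroom_gb(user_limit_gb, queue_limit_gb, user_used_gb, queue_used_gb, reserved_gb):
    """Space this user could still reserve under the assumed rule:
    usage + reservations must stay within both the user and queue limits."""
    user_headroom = user_limit_gb - user_used_gb - reserved_gb
    queue_headroom = queue_limit_gb - queue_used_gb - reserved_gb
    return max(0, min(user_headroom, queue_headroom))


def try_reserve(container_gb, reserved_gb, **limits):
    """Add one more reservation if it fits; otherwise leave the total unchanged."""
    if container_gb <= headroom_gb(reserved_gb=reserved_gb, **limits):
        return reserved_gb + container_gb
    return reserved_gb


# Hypothetical cluster: user may use 40 GB, queue may use 60 GB.
limits = dict(user_limit_gb=40, queue_limit_gb=60, user_used_gb=8, queue_used_gb=20)

reserved = 0
for _ in range(10):  # the request stays unsatisfied for ten scheduling passes
    reserved = try_reserve(16, reserved, **limits)

print(reserved)  # reservations stop growing once the user limit is reached
```

In this sketch the reservations grow for two passes and then stop, because a third 16 GB reservation would push the user past the 40 GB user limit.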

As for other container requests, it depends upon where these requests are coming from.  If
these are other requests from the same application then the app needs to change the priority
of the other requests.  The RM allocates containers in priority order, so it won't consider
the other requests until the reservations are satisfied or the request is cancelled.  If the
requests are coming from other apps then the cause could be the priority of the app relative to the
other apps.  Apps ahead in the queue get first crack at resources; otherwise we risk indefinite
postponement.  Proposals to artificially limit reservations for an app also risk this same
indefinite postponement if the scheduler happened to choose poorly where to place the limited
number of reservations.  In a cluster with long running containers, this app may not ever
run in a timely manner.
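
The priority-order behavior for requests within a single app can be sketched with a toy Python model. This is not RM code. The function name, the tuple layout, and the convention that a lower number means a more urgent request are assumptions made for illustration, and the model ignores per-node placement entirely.

```python
def allocate_in_priority_order(requests, free_gb):
    """requests: list of (priority, size_gb); lower priority number is
    served first (an assumption of this sketch).

    Returns (granted, blocked). Allocation stops at the first request that
    does not fit, mimicking a reservation that holds up later requests.
    """
    granted = []
    for priority, size_gb in sorted(requests):
        if size_gb > free_gb:
            # This request would become a reservation; nothing at a lower
            # priority is considered until it is satisfied or cancelled.
            return granted, (priority, size_gb)
        free_gb -= size_gb
        granted.append((priority, size_gb))
    return granted, None


# Large request is the most urgent: the small ones wait behind it.
granted_before, blocked_before = allocate_in_priority_order(
    [(1, 16), (2, 2), (2, 2)], free_gb=8)

# App re-prioritizes: small requests now go first and are granted.
granted_after, blocked_after = allocate_in_priority_order(
    [(2, 16), (1, 2), (1, 2)], free_gb=8)

print(granted_before, blocked_before)
print(granted_after, blocked_after)
```

With the large request at the front nothing is granted, while after the app changes the priorities the two small requests are granted and only the large one remains blocked.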

One way to achieve something close to what you are proposing is to have the problematic app
run in a separate queue where you can explicitly cap the resources associated with that app,
reserved or otherwise.  The app will only be able to reserve up to the queue's capacity at
most.  This should work quite well, assuming the total resources required by the app are less
than what you are willing to allow it to reserve in its attempt to get containers.
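
A separate queue along these lines might look like the following capacity-scheduler.xml fragment. This is a sketch only: the queue name "bigapp" and the percentages are made-up values, and the right numbers depend on the cluster. The property names are the standard CapacityScheduler queue settings.

```xml
<!-- capacity-scheduler.xml (fragment). Queue name and percentages are illustrative assumptions. -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default,bigapp</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>80</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.bigapp.capacity</name>
  <value>20</value>
</property>
<property>
  <!-- Hard ceiling for the queue, so the problematic app cannot grow past 20% of the cluster -->
  <name>yarn.scheduler.capacity.root.bigapp.maximum-capacity</name>
  <value>20</value>
</property>
```

The problematic app would then be submitted to the "bigapp" queue, leaving the rest of the cluster available to apps in the other queue.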


> total capacity could be occupied by a large container request
> -------------------------------------------------------------
>
>                 Key: YARN-7408
>                 URL: https://issues.apache.org/jira/browse/YARN-7408
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: kyungwan nam
>
> If an NM cannot accommodate a large container request, a container is reserved for it.
> But in a cluster with long-running apps, running containers are not released often.
> In cases like this, reserved containers increase as time goes on. As a result, the total capacity could be occupied by reserved resources.
> This makes other container requests starve.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

