hadoop-yarn-issues mailing list archives

From Antal Bálint Steinbach (JIRA) <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-8468) Limit container sizes per queue in FairScheduler
Date Mon, 23 Jul 2018 13:48:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552885#comment-16552885
] 

Antal Bálint Steinbach edited comment on YARN-8468 at 7/23/18 1:47 PM:
-----------------------------------------------------------------------

Hi [~haibochen]!

I only commented "Thanks for the feedback [~wilfreds]", but I also addressed his suggestions.
Sorry for the confusion; please find my responses inline.
 - An {{FSLeafQueue}} and an {{FSParentQueue}} always have a parent, so a null check on
the parent is unneeded. The only queue that does not have a parent is the root queue, which
you have already special-cased. _(In some tests a sub-queue does not have a parent.)_
 - {{getMaximumResourceCapability}} must support resource types and not just memory and vcores,
same as YARN-7556 for this setting (_It returns Resource, so I assume that it is OK with resource
types_)
 - {{getMaxAllowedAllocation}} from the NodeTracker supports more than just memory and vcores;
this needs to flow through (_It returns Resource, so I assume that it is OK with resource types_)
 - {{FairScheduler}}: Why change the static imports for only part of the config values?
Either change all or none (none is preferred) (_Fixed_)
 - {{FairSchedulerPage}}: missing toString on the ResourceInfo (_added, but I can't see why
it is necessary_)
 - Testing must also use resource types, not only the old configuration type, for example: "memory-mb=5120,
test1=4, vcores=2" _(Test added)_
 - {{TestFairScheduler}}: Testing must also include failure cases for sub-queues, not just
the root queue: setting the value on the root queue should throw and should not be applied (_Fixed_)
 - If this TestQueueMaxContainerAllocationValidator is a new file, make sure that you add the
license etc. (_license text added for the new files_)

Balint


was (Author: bsteinbach):
Hi [~haibochen]!

I only commented "Thanks for the feedback [~wilfreds]", but I also addressed his suggestions.
Sorry for the confusion; please find my responses inline.
 - An {{FSLeafQueue}} and an {{FSParentQueue}} always have a parent, so a null check on
the parent is unneeded. The only queue that does not have a parent is the root queue, which
you have already special-cased. _(In some tests a sub-queue does not have a parent.)_
 - {{getMaximumResourceCapability}} must support resource types and not just memory and vcores,
same as YARN-7556 for this setting (_It supports Resource, so I assume that it is OK with resource
types_)
 - {{getMaxAllowedAllocation}} from the NodeTracker supports more than just memory and vcores;
this needs to flow through (_It supports Resource, so I assume that it is OK with resource types_)
 - {{FairScheduler}}: Why change the static imports for only part of the config values?
Either change all or none (none is preferred) (_Fixed_)
 - {{FairSchedulerPage}}: missing toString on the ResourceInfo (_added, but I can't see why
it is necessary_)
 - Testing must also use resource types, not only the old configuration type, for example: "memory-mb=5120,
test1=4, vcores=2" _(Test added)_
 - {{TestFairScheduler}}: Testing must also include failure cases for sub-queues, not just
the root queue: setting the value on the root queue should throw and should not be applied (_Fixed_)
 - If this TestQueueMaxContainerAllocationValidator is a new file, make sure that you add the
license etc. (_license text added for the new files_)

Balint

> Limit container sizes per queue in FairScheduler
> ------------------------------------------------
>
>                 Key: YARN-8468
>                 URL: https://issues.apache.org/jira/browse/YARN-8468
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 3.1.0
>            Reporter: Antal Bálint Steinbach
>            Assignee: Antal Bálint Steinbach
>            Priority: Critical
>              Labels: patch
>         Attachments: YARN-8468.000.patch, YARN-8468.001.patch, YARN-8468.002.patch, YARN-8468.003.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit
the overall size of a container. This applies globally to all containers, cannot be limited
by queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per queue basis.
>  
> The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User
wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many
resources as needed. yarn.scheduler.maximum-allocation-mb sets the default maximum container
size for all queues, and the maximum resources per queue are set with the “maxContainerResources”
queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and leaf), this
will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should not allow
that.
>  * make sure that the queue resource cap cannot be larger than the scheduler max resource cap
in the config.
>  * implement getMaximumResourceCapability(String queueName) in the FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue as
follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.
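
The resolution rules in the list above can be sketched roughly as follows. This is a minimal,
self-contained model and not the patch itself: the class QueueMaxAllocationSketch, its nested
Queue class, and the plain Map used in place of Hadoop's Resource are hypothetical names chosen
for illustration. It also assumes that a queue without its own setting inherits from the nearest
ancestor that has one, and that an oversized queue value is capped at the scheduler maximum
rather than rejected; the ticket does not say which of the two the patch does.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified model of per-queue maximum container size; NOT the Hadoop classes. */
public class QueueMaxAllocationSketch {

    /** A queue with an optional per-queue maximum container allocation. */
    static final class Queue {
        final String name;
        final Queue parent; // null only for the root queue
        private Map<String, Long> maxContainerAllocation; // null means "not set"

        Queue(String name, Queue parent) {
            this.name = name;
            this.parent = parent;
        }

        void setMaxContainerAllocation(Map<String, Long> max) {
            if (parent == null) {
                // Setting the value on root would override the scheduler-wide setting.
                throw new IllegalArgumentException(
                    "max container allocation cannot be set on the root queue");
            }
            this.maxContainerAllocation = new TreeMap<>(max);
        }

        /** Walk up to the nearest queue with a setting; fall back to the scheduler maximum. */
        Map<String, Long> getMaximumResourceCapability(Map<String, Long> schedulerMax) {
            for (Queue q = this; q != null; q = q.parent) {
                if (q.maxContainerAllocation != null) {
                    return capAtSchedulerMax(q.maxContainerAllocation, schedulerMax);
                }
            }
            return new TreeMap<>(schedulerMax);
        }
    }

    /**
     * For every resource type known to the scheduler, take the smaller of the queue value
     * and the scheduler value. Types the queue does not mention keep the scheduler value;
     * types unknown to the scheduler are ignored (an assumption of this sketch).
     */
    static Map<String, Long> capAtSchedulerMax(Map<String, Long> queueMax,
                                               Map<String, Long> schedulerMax) {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long> e : schedulerMax.entrySet()) {
            long queueValue = queueMax.getOrDefault(e.getKey(), e.getValue());
            result.put(e.getKey(), Math.min(queueValue, e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> schedulerMax = new TreeMap<>();
        schedulerMax.put("memory-mb", 8192L);
        schedulerMax.put("vcores", 4L);
        schedulerMax.put("test1", 10L);

        Queue root = new Queue("root", null);
        Queue adhoc = new Queue("root.adhoc", root);
        Queue adhocChild = new Queue("root.adhoc.dynamic", adhoc);
        Queue enterprise = new Queue("root.enterprise", root);

        Map<String, Long> small = new TreeMap<>();
        small.put("memory-mb", 5120L);
        small.put("test1", 4L);
        small.put("vcores", 2L);
        adhoc.setMaxContainerAllocation(small);

        // The dynamically created child inherits the ad hoc limit; enterprise is unrestricted.
        System.out.println(adhocChild.name + " -> "
            + adhocChild.getMaximumResourceCapability(schedulerMax));
        System.out.println(enterprise.name + " -> "
            + enterprise.getMaximumResourceCapability(schedulerMax));
    }
}
```

Keeping the limit as a map keyed by resource name is meant to mirror the review point that the
setting has to work for arbitrary resource types, not only memory and vcores.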



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

