hadoop-yarn-issues mailing list archives

From "Craig Condit (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-9569) Auto-created leaf queues do not honor cluster-wide min/max memory/vcores
Date Mon, 20 May 2019 20:08:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-9569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Craig Condit updated YARN-9569:
-------------------------------
    Description: 
Auto-created leaf queues do not honor cluster-wide settings for maximum memory/vcores allocation.

To reproduce:
 # Set auto-create-child-queue.enabled=true for a parent queue.
 # Set leaf-queue-template.maximum-allocation-mb=16384.
 # Set yarn.resource-types.memory-mb.maximum-allocation=16384 in resource-types.xml
 # Launch a YARN app with a container requesting 16 GB RAM.

 

This scenario should work, but instead you get an error similar to this:
{code:java}
java.lang.IllegalArgumentException: Queue maximum allocation cannot be larger than the cluster
setting for queue root.auto.test max allocation per queue: <memory:16384, vCores:1>
cluster setting: <memory:8192, vCores:4>   {code}
 

This seems to be caused by this code in ManagedParentQueue.getLeafQueueConfigs:
{code:java}
CapacitySchedulerConfiguration leafQueueConfigTemplate = new
    CapacitySchedulerConfiguration(new Configuration(false), false);{code}
 

This initializes a new leaf queue configuration that does not read resource-types.xml (or
any other config). Later, this CapacitySchedulerConfiguration instance calls ResourceUtils.fetchMaximumAllocationFromConfig() 
from its getMaximumAllocationPerQueue() method and passes itself as the configuration to use.
Since the resource types are not present, ResourceUtils falls back to compiled-in defaults
of 8 GB RAM and 4 vcores, which matches the cluster setting reported in the error above.
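
The fallback described above can be illustrated with a small self-contained model. This is only a sketch and not Hadoop code: the class FallbackSketch, its methods, and the plain Map standing in for a Configuration are hypothetical, and the vcores key is assumed by analogy with the memory key from the reproduction steps. The default values (8192 MB, 4 vcores) are the ones quoted in this report.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the fallback behaviour; hypothetical, not Hadoop code.
class FallbackSketch {
    // Compiled-in defaults quoted in the report: 8 GB RAM, 4 vcores.
    static final long DEFAULT_MAX_MB = 8192L;
    static final long DEFAULT_MAX_VCORES = 4L;

    // Memory key from the reproduction steps; vcores key assumed by analogy.
    static final String MAX_MB_KEY =
        "yarn.resource-types.memory-mb.maximum-allocation";
    static final String MAX_VCORES_KEY =
        "yarn.resource-types.vcores.maximum-allocation";

    // Returns {memoryMb, vcores}, using the defaults for keys absent from conf.
    static long[] clusterMax(Map<String, String> conf) {
        long mb = Long.parseLong(
            conf.getOrDefault(MAX_MB_KEY, Long.toString(DEFAULT_MAX_MB)));
        long vcores = Long.parseLong(
            conf.getOrDefault(MAX_VCORES_KEY, Long.toString(DEFAULT_MAX_VCORES)));
        return new long[] {mb, vcores};
    }

    // Rejects a queue maximum larger than the cluster maximum visible in conf.
    static void validateQueueMax(long queueMb, long queueVcores,
                                 Map<String, String> conf) {
        long[] max = clusterMax(conf);
        if (queueMb > max[0] || queueVcores > max[1]) {
            throw new IllegalArgumentException(
                "Queue maximum allocation cannot be larger than the cluster setting:"
                + " queue <memory:" + queueMb + ", vCores:" + queueVcores + ">"
                + " cluster <memory:" + max[0] + ", vCores:" + max[1] + ">");
        }
    }

    public static void main(String[] args) {
        // Configuration that has seen the resource-types setting: check passes.
        Map<String, String> clusterConf = new HashMap<>();
        clusterConf.put(MAX_MB_KEY, "16384");
        validateQueueMax(16384L, 1L, clusterConf);

        // Empty configuration, like the freshly created template: check fails.
        try {
            validateQueueMax(16384L, 1L, new HashMap<>());
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In this model the same 16384 MB queue limit passes against a configuration that contains the resource-type setting and fails against an empty one, which mirrors the symptom reported here.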

 

I was able to work around this with a custom AutoCreatedQueueManagementPolicy implementation
which does something like this in init() and reinitialize():
{code:java}
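// Copy every resource-type setting from the scheduler configuration
// into the leaf queue template configuration.
for (Map.Entry<String, String> entry : this.scheduler.getConfiguration()) {
  if (entry.getKey().startsWith("yarn.resource-types")) {
    parentQueue.getLeafQueueTemplate().getLeafQueueConfigs()
        .set(entry.getKey(), entry.getValue());
  }
}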
{code}
However, this is obviously a very hacky way to solve the problem.
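
The copying step of the workaround can be exercised in isolation with plain maps. The following is a minimal sketch under stated assumptions: CopyResourceTypesSketch and copyResourceTypeSettings are hypothetical names, the maps stand in for the scheduler and leaf queue template configurations, and the key "some.other.setting" is made up purely to show that unrelated entries are skipped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the workaround's copy loop; not Hadoop code.
class CopyResourceTypesSketch {
    // Prefix used by the workaround in this report.
    static final String PREFIX = "yarn.resource-types";

    // Copies every entry whose key starts with PREFIX from source to target
    // and returns the number of entries copied.
    static int copyResourceTypeSettings(Map<String, String> source,
                                        Map<String, String> target) {
        int copied = 0;
        for (Map.Entry<String, String> entry : source.entrySet()) {
            if (entry.getKey().startsWith(PREFIX)) {
                target.put(entry.getKey(), entry.getValue());
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        Map<String, String> schedulerConf = new LinkedHashMap<>();
        schedulerConf.put("yarn.resource-types.memory-mb.maximum-allocation", "16384");
        schedulerConf.put("some.other.setting", "ignored"); // made-up key

        // Starts empty, like the template built from new Configuration(false).
        Map<String, String> leafTemplateConf = new LinkedHashMap<>();

        int copied = copyResourceTypeSettings(schedulerConf, leafTemplateConf);
        System.out.println("Copied " + copied + " entries: " + leafTemplateConf);
    }
}
```

Filtering by prefix keeps the template configuration otherwise empty, so only the resource-type limits leak in from the scheduler configuration.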

I can submit a proper patch if someone can provide some direction as to the best way to proceed.

 


> Auto-created leaf queues do not honor cluster-wide min/max memory/vcores
> ------------------------------------------------------------------------
>
>                 Key: YARN-9569
>                 URL: https://issues.apache.org/jira/browse/YARN-9569
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler
>    Affects Versions: 3.2.0
>            Reporter: Craig Condit
>            Priority: Major
>




