hadoop-yarn-issues mailing list archives

From "Szilard Nemeth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4022) queue not remove from webpage(/cluster/scheduler) when delete queue in xxx-scheduler.xml
Date Mon, 22 Jan 2018 10:32:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334116#comment-16334116 ]

Szilard Nemeth commented on YARN-4022:

Hey @danieltempleton, @yufei!

Could you please help me a bit with this one?

I left the _{{yarn.scheduler.fair.user-as-default-queue}}_ and _{{yarn.scheduler.fair.allow-undeclared-pools}}_
configs at their default values (true) because, based on the FairScheduler page ([https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html]),
this is the most sensible setup for having queues created dynamically.
 When I started a pi job, it was assigned to a dynamically created queue, namely "root.szilardnemeth",
so I guess the above config is correct.
 When I ran {{yarn rmadmin -refreshQueues}} and checked the RM Webservices API with a GET
request to "ws/v1/cluster/scheduler", the queue was still there.
 After that, I debugged the calls described below and found that queues are not deleted
when refreshQueues is invoked, even if they are empty.
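
The check against the RM Webservices API can be repeated with a small client. The following is only a minimal sketch: the address {{http://localhost:8088}} and the class and method names are assumptions made for illustration, and the queue lookup is a crude substring match on the raw response rather than real JSON parsing.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of Hadoop: fetches the scheduler info from the
// ResourceManager REST API and reports whether a queue name still appears in it.
class SchedulerInfoCheck {

    // Appends the scheduler endpoint path to the RM web address.
    static URI schedulerUri(String rmWebAddress) {
        String base = rmWebAddress.endsWith("/")
                ? rmWebAddress.substring(0, rmWebAddress.length() - 1)
                : rmWebAddress;
        return URI.create(base + "/ws/v1/cluster/scheduler");
    }

    // Crude check: true if the quoted queue name occurs anywhere in the response body.
    static boolean mentionsQueue(String responseBody, String queueName) {
        return responseBody.contains("\"" + queueName + "\"");
    }

    public static void main(String[] args) {
        // Assumption: the RM web UI listens on localhost:8088; pass another address as args[0].
        String rmWebAddress = args.length > 0 ? args[0] : "http://localhost:8088";
        String queueName = args.length > 1 ? args[1] : "root.szilardnemeth";

        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(schedulerUri(rmWebAddress))
                .header("Accept", "application/json")
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("HTTP status: " + response.statusCode());
            System.out.println("Queue " + queueName + " still listed: "
                    + mentionsQueue(response.body(), queueName));
        } catch (IOException | InterruptedException e) {
            System.out.println("Could not reach the ResourceManager: " + e);
        }
    }
}
```

Running it once before and once after {{yarn rmadmin -refreshQueues}} shows whether the dynamically created queue disappears from the scheduler info.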

Eventually, {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager#removeEmptyIncompatibleQueues}}
is invoked, but this method does not delete my dynamically created leaf queue. Moreover, this
method does not seem to be a good place to add queue removal functionality, since it only deals
with incompatible queues.

I found out that when I run the command {{yarn rmadmin -refreshQueues}}, the following relevant
calls are performed:

 1. {{AdminService.refreshQueues}} handles the CLI command.
 2. {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler#reinitialize}}
is invoked.
 3. The method above invokes {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService#reloadAllocations}},
which loads the allocations.xml file.
 At the end of this method, the {{reloadListener}} is called with the parsed configuration
object: {{reloadListener.onReload(info);}}
 4. {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.AllocationReloadListener#onReload}}
is invoked.
 5. The method above invokes {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager#updateAllocationConfiguration}}.
 This method is responsible for removing incompatible queues, see {{removeEmptyIncompatibleQueues}}
in {{QueueManager}} (at the time of writing: [https://github.com/apache/hadoop/blob/99292adcefdc6b8f280b8e100605fb39f755c38a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java#L351])
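
The effect of the call chain above can be illustrated with a toy model. This is a simplified sketch, not the real Hadoop code: {{ToyQueueManager}}, {{Queue}} and {{simulateRefresh}} are hypothetical stand-ins, and "incompatible" is assumed to mean "an existing queue whose type (leaf or parent) conflicts with a queue declared in the allocation file".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the refresh path; all names here are hypothetical stand-ins.
class RefreshQueuesTrace {

    enum QueueType { LEAF, PARENT }

    static final class Queue {
        final String name;
        final QueueType type;
        final int runningApps;

        Queue(String name, QueueType type, int runningApps) {
            this.name = name;
            this.type = type;
            this.runningApps = runningApps;
        }

        boolean isEmpty() {
            return runningApps == 0;
        }
    }

    static final class ToyQueueManager {
        final Map<String, Queue> queues = new LinkedHashMap<>();

        // Stand-in for step 5: only queues declared in the allocation file are examined.
        void updateAllocationConfiguration(Map<String, QueueType> declaredQueues) {
            for (Map.Entry<String, QueueType> entry : declaredQueues.entrySet()) {
                removeEmptyIncompatibleQueue(entry.getKey(), entry.getValue());
                queues.putIfAbsent(entry.getKey(),
                        new Queue(entry.getKey(), entry.getValue(), 0));
            }
        }

        // Removes an existing queue only if it is empty AND its type conflicts
        // with the declared type. Undeclared queues are never looked at.
        void removeEmptyIncompatibleQueue(String name, QueueType wantedType) {
            Queue existing = queues.get(name);
            if (existing != null && existing.type != wantedType && existing.isEmpty()) {
                queues.remove(name);
            }
        }
    }

    // Simulates a refresh where the allocation file declares only root.default.
    static ToyQueueManager simulateRefresh() {
        ToyQueueManager queueManager = new ToyQueueManager();
        queueManager.queues.put("root.default",
                new Queue("root.default", QueueType.LEAF, 0));
        // Dynamically created, empty leaf queue that is not in the allocation file.
        queueManager.queues.put("root.szilardnemeth",
                new Queue("root.szilardnemeth", QueueType.LEAF, 0));

        Map<String, QueueType> declaredQueues = new LinkedHashMap<>();
        declaredQueues.put("root.default", QueueType.LEAF);
        queueManager.updateAllocationConfiguration(declaredQueues);
        return queueManager;
    }

    public static void main(String[] args) {
        // Prints [root.default, root.szilardnemeth]: the dynamic queue survives.
        System.out.println(simulateRefresh().queues.keySet());
    }
}
```

Under these assumptions the empty dynamic queue is never a removal candidate, because nothing in the allocation file refers to it.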

*For me, adding the queue removal functionality to FairScheduler.reinitialize would be the
most logical thing to do: the rest of the methods are strongly related to reading the allocations
file, and since dynamically created queues are not based on that file, they are a "separate entity".*

My questions:
 1. Should all empty dynamically created queues be removed when the refreshQueues command
is invoked with the CLI?
 2. May all empty queues be removed when the refreshQueues command is invoked, or just the dynamically
created ones?
 3. If the answer to question 2 is "just the dynamically created queues can be removed",
how can I differentiate the normal queues from the dynamically created queues?
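
One possible shape for question 3, purely as a sketch: if each queue recorded at creation time whether it came from the allocation file or was created on demand, the cleanup could filter on that marker. The {{dynamic}} flag, the {{Queue}} class and {{removeEmptyDynamicQueues}} below are hypothetical and are not claimed to exist in FairScheduler.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: removing empty queues that carry a "dynamic" marker.
class DynamicQueueCleanup {

    static final class Queue {
        final String name;
        final boolean dynamic; // assumed to be set when the queue is created on demand
        final int numApps;

        Queue(String name, boolean dynamic, int numApps) {
            this.name = name;
            this.dynamic = dynamic;
            this.numApps = numApps;
        }
    }

    // Removes every queue that is both dynamic and empty from the given list
    // and returns the names of the removed queues.
    static List<String> removeEmptyDynamicQueues(List<Queue> queues) {
        List<String> removed = new ArrayList<>();
        Iterator<Queue> iterator = queues.iterator();
        while (iterator.hasNext()) {
            Queue queue = iterator.next();
            if (queue.dynamic && queue.numApps == 0) {
                iterator.remove();
                removed.add(queue.name);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Queue> queues = new ArrayList<>();
        queues.add(new Queue("root.default", false, 0));      // from the allocation file, empty
        queues.add(new Queue("root.szilardnemeth", true, 0)); // dynamic, empty
        queues.add(new Queue("root.alice", true, 1));         // dynamic, still has an app

        // Prints [root.szilardnemeth]
        System.out.println(removeEmptyDynamicQueues(queues));
    }
}
```

With such a marker, empty queues declared in the allocation file stay untouched, and dynamic queues that still have applications are kept as well.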

> queue not remove from webpage(/cluster/scheduler) when delete queue in xxx-scheduler.xml
> ----------------------------------------------------------------------------------------
>                 Key: YARN-4022
>                 URL: https://issues.apache.org/jira/browse/YARN-4022
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler, resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: forrestchen
>            Assignee: Szilard Nemeth
>            Priority: Major
>              Labels: oct16-medium, scheduler
>         Attachments: YARN-4022.001.patch, YARN-4022.002.patch, YARN-4022.003.patch, YARN-4022.004.patch
> When I delete an existing queue by modifying xxx-scheduler.xml, I can still see the queue's
> information block on the web page (/cluster/scheduler), though its 'Min Resources' items all become
> zero and there is no 'Max Running Applications' item.
> I can still submit an application to the deleted queue, and the application runs in the
> 'root.default' queue instead, whereas submitting to a nonexistent queue causes an exception.
> My expectation is that the deleted queue is not displayed on the web page and that submitting an
> application to the deleted queue behaves just as if the queue did not exist.
> PS: There is no application running in the queue I deleted.
> Some related config in yarn-site.xml:
> {code}
> <property>
>         <name>yarn.scheduler.fair.user-as-default-queue</name>
>         <value>false</value>
> </property>
> <property>
>         <name>yarn.scheduler.fair.allow-undeclared-pools</name>
>         <value>false</value>
> </property>
> {code}
> A related question is here: http://stackoverflow.com/questions/26488564/hadoop-yarn-why-the-queue-cannot-be-deleted-after-i-revise-my-fair-scheduler-xm
