hadoop-yarn-issues mailing list archives

From "Gergo Repas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart
Date Tue, 22 May 2018 12:27:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483864#comment-16483864 ]

Gergo Repas commented on YARN-8191:
-----------------------------------

[~haibochen] Thanks for the review!
1) - Good point, I fixed it.
2) - This logic originates from a suggestion by [~wilfreds] (Wilfred, please correct me if
I'm wrong about the intentions behind {{getRemovedStaticQueues()}} and {{setQueuesToDynamic()}}).
The point is that the set of removed queues can be gathered in {{AllocationReloadListener.onReload()}}
outside of the writeLock. This is safe because {{onReload()}} is only called from the synchronized
{{AllocationFileLoaderService.reloadAllocations()}} method. This way the {{AllocationReloadListener.getRemovedStaticQueues()}}
logic is subject to the least amount of locking. Thread safety was indeed missing for
{{QueueManager.setQueuesToDynamic()}}; I've added the missing synchronized block.
3) Sorry, what do you mean by "What about the other case where some dynamic queues are not
added as static in the new allocation file?"? If you mean dynamic queue creation via application
submission, the test case for this (plus the removal) is {{TestQueueManager.testRemovalOfDynamicLeafQueue()}}.
4-5) I have refactored this part of the code: I removed {{getIncompatibleQueueName()}} and changed
only the return type of {{removeEmptyIncompatibleQueues()}}, so that it indicates whether removal
of any queue was attempted.
6) {{updateAllocationConfiguration()}} is only called when the configuration file has been
modified, so if, for example, there's only one configuration modification during the lifetime
of the RM, incompatible queues would not be cleaned up until a restart.
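
To illustrate the locking argument in point 2, here is a minimal sketch, not the code from the patch. The class {{QueueReloadSketch}} and its methods {{reload()}} and {{isDynamic()}} are hypothetical stand-ins; only the idea (compute the removed set outside the write lock, inside an already synchronized reload method, then take the lock just for the state change) comes from the discussion above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in, not the real Hadoop classes.
class QueueReloadSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Static queues from the previously loaded allocation file.
    // Only read and written inside the synchronized reload() method.
    private Set<String> staticQueues = new HashSet<>();
    // Queues that were static but are now treated as dynamic,
    // and are therefore candidates for removal once empty.
    private final Set<String> dynamicQueues = new HashSet<>();

    // Plays the role of a synchronized reload entry point:
    // at most one reload runs at a time.
    synchronized void reload(Set<String> newStaticQueues) {
        // Step 1: gather the removed static queues WITHOUT the write lock.
        // Safe because staticQueues is only touched in this synchronized method.
        Set<String> removed = new HashSet<>(staticQueues);
        removed.removeAll(newStaticQueues);

        // Step 2: hold the write lock only while changing state that
        // readers can observe.
        lock.writeLock().lock();
        try {
            dynamicQueues.addAll(removed);
            // A queue that is static again is no longer dynamic.
            dynamicQueues.removeAll(newStaticQueues);
        } finally {
            lock.writeLock().unlock();
        }

        staticQueues = new HashSet<>(newStaticQueues);
    }

    // Readers take the read lock, so they never see a half-applied update.
    boolean isDynamic(String queue) {
        lock.readLock().lock();
        try {
            return dynamicQueues.contains(queue);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The set difference in step 1 is the potentially expensive part, so keeping it outside the write lock shortens the time readers are blocked.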

> Fair scheduler: queue deletion without RM restart
> -------------------------------------------------
>
>                 Key: YARN-8191
>                 URL: https://issues.apache.org/jira/browse/YARN-8191
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 3.0.1
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>         Attachments: Queue Deletion in Fair Scheduler.pdf, YARN-8191.000.patch, YARN-8191.001.patch,
YARN-8191.002.patch, YARN-8191.003.patch, YARN-8191.004.patch, YARN-8191.005.patch, YARN-8191.006.patch,
YARN-8191.007.patch, YARN-8191.008.patch, YARN-8191.009.patch, YARN-8191.010.patch, YARN-8191.011.patch,
YARN-8191.012.patch, YARN-8191.013.patch
>
>
> The Fair Scheduler never cleans up queues, even if they are deleted in the allocation
file or were dynamically created and are never going to be used again. Queues always remain
in memory, which leads to the following two issues:
>  # Steady fairshares aren’t calculated correctly due to remaining queues
>  # WebUI shows deleted queues, which is confusing for users (YARN-4022).
> We want to support proper queue deletion without restarting the Resource Manager:
>  # Static queues without any entries that are removed from fair-scheduler.xml should
be deleted from memory.
>  # Dynamic queues without any entries should be deleted.
>  # RM Web UI should only show the queues defined in the scheduler at that point in time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

