hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart
Date Fri, 18 May 2018 23:38:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16481341#comment-16481341

Haibo Chen commented on YARN-8191:

Thanks [~grepas] for the patch! I have some comments.

1) In AllocationFileLoaderService, instead of adding a queueManager and allowing it to be
set, we can add another method, onCheck() to AllocationFileLoaderService.Listener.  Then
listener.onCheck() can be called in each iteration of the while loop in AllocationFileLoaderService.
This approach is already taken by onReload() method, so we can simply follow it.

2) Can we move `Set<String> removedStaticQueues = getRemovedStaticQueues(queueInfo);  
queueMgr.setQueuesToDynamic(removedStaticQueues);` logic into QueueManager.updateAllocationConfiguration(),
where we handle the queue updates?  setQueuesToDynamic() is technically not thread-safe.

3) The patch currently handles the case where some static queues are removed from the new
allocation file. What about the other case where some dynamic queues are not added as static
in the new allocation file? Though I think this is less likely to happen.

4) I suppose  getIncompatibleQueueName() finds that existing queue that is incompatible with
the queueToCreate. If that's the case, we need to return queue.getName() when the type before
and after is different. That is, the code should probably
    if (queue != null) {
      if (queue instanceof FSLeafQueue) {
        if (queueType == FSQueueType.LEAF) {
          return null;
        } else {
          return queue.getName();
      } else {
        if (queueType == FSQueueType.PARENT) {
          return null;
        } else {
          return queue.getName();
5) The removeEmptyIncompatibleQueues(String queueToCreate, FSQueueType queueType, boolean
defaultReturnValue) seems complicated to me with the newly added defaultReturnValue. If you
look at removeLeafQueue(), previously it would return true if there is no incompatible queue
found. But it now returns false.  I think it illustrates well that the defaultReturnValue
adds too much mental overhead, but no obvious benefits, IMO.

6) Tracking IncompatibleQueueRemovalTask is probably unnecessary. If an incompatible queue
cannot be removed because there are some running apps at the moment, it will stay in queues
so that next time when the  updateAllocationConfiguration() is called, it will still be identified
as an incompatible queue and be removed.


> Fair scheduler: queue deletion without RM restart
> -------------------------------------------------
>                 Key: YARN-8191
>                 URL: https://issues.apache.org/jira/browse/YARN-8191
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 3.0.1
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>         Attachments: Queue Deletion in Fair Scheduler.pdf, YARN-8191.000.patch, YARN-8191.001.patch,
YARN-8191.002.patch, YARN-8191.003.patch, YARN-8191.004.patch, YARN-8191.005.patch, YARN-8191.006.patch,
YARN-8191.007.patch, YARN-8191.008.patch, YARN-8191.009.patch, YARN-8191.010.patch, YARN-8191.011.patch,
YARN-8191.012.patch, YARN-8191.013.patch
> The Fair Scheduler never cleans up queues even if they are deleted in the allocation
file, or were dynamically created and are never going to be used again. Queues always remain
in memory which leads to two following issues.
>  # Steady fairshares aren’t calculated correctly due to remaining queues
>  # WebUI shows deleted queues, which is confusing for users (YARN-4022).
> We want to support proper queue deletion without restarting the Resource Manager:
>  # Static queues without any entries that are removed from fair-scheduler.xml should
be deleted from memory.
>  # Dynamic queues without any entries should be deleted.
>  # RM Web UI should only show the queues defined in the scheduler at that point in time.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

View raw message