hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gergo Repas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart
Date Thu, 10 May 2018 15:45:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470570#comment-16470570
] 

Gergo Repas commented on YARN-8191:
-----------------------------------

[~wilfreds] Thank you for the review! I have a question regarding two points, and I have addressed
the rest of the points:
* Can we add a test for this case please? I do not see this specific case in the tests covered,
testNonEmptyStaticQueueBecomingDynamicQueue does not cover this use case 
** I have added testRemovalOfChildlessParentQueue() to test this scenario.
* We already check all queues defined in the configuration on each reload for existence in
the updateAllocationConfiguration via the call to removeEmptyIncompatibleQueues. If the queue
of the correct type exists we currently just return. The only thing we should do now before
we return is unset the isDynamic flag there and there is no need for a separate loop. removeEmptyIncompatibleQueues
is called for each configured queue with each reload.
** Sorry, I do not understand what you're suggesting here. Could you please elaborate a bit
more?
* Keeping the same list in two places does not make sense. You now need to keep them in sync
and it complicates the code in updateAllocationConfiguration which is not needed. We have
also seen clusters with 100's of dynamic queues when user based queues are used so the loops
can become long. Please move the logic for generating the diff to the onReload method where
we have the 2 copies as described above.
** I moved the logic into onReload.
* In assertNotNull should use the q1 object that is returned, multiple tests
** Fixed.
* testRemovalOfDynamicLeafQueue needs to cover a dynamic leaf queue removal which has a static
parent, and make sure the parent is still there. It should also check if a non empty queue
is not removed.
** Good point, fixed.
* testRemovalOfDynamicParentQueue needs to cover a dynamic parent queue without a leaf is
removal.
** How can I create a dynamic parent queue without a leaf? I thought the only way to have
a parent queue without a leaf is to add it to the allocation config with parent="true", but
in this case it'd be a static queue.
* testNonEmptyDynamicQueueBecomingStaticQueue is missing a check after the configuration update
for isDynamic or not. The test name does not really cover the test you do unless you do that.
The queue being empty or not does not really matter.
** Good point, fixed.
* testNonEmptyStaticQueueBecomingDynamicQueue: "root.test.childA" is shown as "root.queue1
is not a static queue" in the assert message. Check the order and comments in the method,
they seem a bit out of order.
** Thanks, fixed.
* We need to cover a parent queue in the tests: dynamic parent + dynamic leaf (no apps) changing
to static. Leaf and parent must be static after the change and the other way around (testQueueTypeChange
?)
** Good idea, I've added testQueueTypeChange().
* We need a test for the updating of the assignedApp in the FSLeafQueue and make sure isEmpty
is working OK.
** Right, I've added testApplicationAssignmentPreventsRemovalOfDynamicQueue() (I'm open for
suggestions regarding the naming).


> Fair scheduler: queue deletion without RM restart
> -------------------------------------------------
>
>                 Key: YARN-8191
>                 URL: https://issues.apache.org/jira/browse/YARN-8191
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 3.0.1
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>         Attachments: Queue Deletion in Fair Scheduler.pdf, YARN-8191.000.patch, YARN-8191.001.patch,
YARN-8191.002.patch, YARN-8191.003.patch, YARN-8191.004.patch, YARN-8191.005.patch, YARN-8191.006.patch,
YARN-8191.007.patch, YARN-8191.008.patch, YARN-8191.009.patch
>
>
> The Fair Scheduler never cleans up queues even if they are deleted in the allocation
file, or were dynamically created and are never going to be used again. Queues always remain
in memory which leads to two following issues.
>  # Steady fairshares aren’t calculated correctly due to remaining queues
>  # WebUI shows deleted queues, which is confusing for users (YARN-4022).
> We want to support proper queue deletion without restarting the Resource Manager:
>  # Static queues without any entries that are removed from fair-scheduler.xml should
be deleted from memory.
>  # Dynamic queues without any entries should be deleted.
>  # RM Web UI should only show the queues defined in the scheduler at that point in time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message