hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5556) Support for deleting queues without requiring a RM restart
Date Wed, 04 Jan 2017 21:29:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15799398#comment-15799398 ]

Naganarasimha G R commented on YARN-5556:
-----------------------------------------

Hi [~xgong] and [~wangda],
I have a few queries based on the design document attached to YARN-5724:
# So to delete a queue (say a2), the user needs to remove the queue from its parent's
"yarn.scheduler.capacity.<parent queue>.queues" config and also set its state
(yarn.scheduler.capacity.<root...a2>.state) to {{DELETED}}, right?
# How do we delete intermediate queues? I presume we need *NOT* configure the state for each of
its children, right? Or do we plan to support deleting only leaf queues?
# Do we need to consider moving queues (along with their apps) from one queue hierarchy
to another? IMO it complicates things, but I am not sure about the real-world use cases.
# In the case of HA, I think it gets more complicated. If both RMs are initialized with the
old queue settings and the queue configuration is then updated, CS is aware of the deleted queue.
But if an RM starts off with the updated xml (with the queue already removed), the deleted queue's
information is not available, and if failover happens to this RM, apps running on the deleted
queue cannot be recovered because the queue doesn't exist. So do we need to start maintaining
deleted queues in the state store, or do we need to create queue objects for queues whose state
has been marked as deleted (in which case we need to consider the 2nd point)?
# More of a test scenario: I have a queue with apps running, and I delete the queue, which
makes it go into the drain state (or a new Deleted state, where the queue is not removed until
all apps under it have finished). The apps take some time to finish, and meanwhile the xml is
updated again with a new queue that has the same name and path as the one deleted earlier. Do we
need to support adding this new queue, or do we disallow it while the earlier queue is still in
the process of deletion?
# Do we need to consider showing deleted queues in the web UI? Maybe in another jira,
but the code needs to be updated.
# For the below comment in the doc:
bq. "For the resources of the deleted/stopped queue, users should explicitly distribute them
away to its siblings."
* While we allow running apps to complete, do we allow pending container requests to be served
for these apps? If so, is the deleted queue's capacity considered to be 0% and its max capacity
as per its config?
* If the deleted queue's capacity is not explicitly redistributed to its siblings, I presume we
need to throw an exception?
* Do we need to consider preempting resources from these deleted queues when there is a shortage
of resources?
* What should happen to the pending (not activated) apps in the queue: kill them, or give them
a chance to complete like the running apps?
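
To make points 1 and 7 concrete, here is a rough sketch of the checks I have in mind. It is only an illustration, not code from the design document or from any patch: {{check_deletion}} is a hypothetical helper, the configuration is modelled as a plain dict of property names to values, and both the {{DELETED}} state value and the rule that the remaining siblings must add up to 100% are assumptions taken from the questions above.

```python
PREFIX = "yarn.scheduler.capacity."


def check_deletion(conf, parent, child):
    """Return a list of problems with a proposed deletion of <parent>.<child>.

    Hypothetical helper: `conf` is a dict of property name -> string value.
    """
    problems = []

    # Point 1: the child should be gone from the parent's ".queues" list.
    raw = conf.get(PREFIX + parent + ".queues", "")
    remaining = [q.strip() for q in raw.split(",") if q.strip()]
    if child in remaining:
        problems.append(f"{child} is still listed in {parent}.queues")

    # Point 1: the child should be explicitly marked DELETED (assumed state name).
    state = conf.get(f"{PREFIX}{parent}.{child}.state")
    if state != "DELETED":
        problems.append(f"{parent}.{child}.state is {state!r}, expected 'DELETED'")

    # Point 7: the freed capacity should have been redistributed to the siblings.
    siblings = [q for q in remaining if q != child]
    total = sum(float(conf.get(f"{PREFIX}{parent}.{q}.capacity", "0")) for q in siblings)
    if siblings and abs(total - 100.0) > 1e-6:
        problems.append(f"sibling capacities under {parent} sum to {total}, not 100")

    return problems
```

An empty result would mean the refresh can proceed. Whether a non-empty result should fail the refresh with an exception is exactly the open question in point 7.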

I can port/rebase my original patch (which just goes ahead and deletes the queue
if no apps are present), but I presume the scope has changed. Once these points
are clarified, we can decide the scope of this jira and I will then work on it.
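
For comparison, a toy model of the two behaviours discussed here: delete only when no apps are present (what my original patch does, as described above) versus drain first and remove once the last app finishes. {{ToyScheduler}} and its methods are hypothetical names used only for illustration; this is not the CapacityScheduler API, and it deliberately ignores HA and recovery.

```python
class ToyScheduler:
    """Toy model of queue deletion; not the CapacityScheduler API."""

    def __init__(self):
        self.apps = {}         # queue path -> set of running app ids
        self.draining = set()  # queue paths waiting for their apps to finish

    def add_queue(self, path):
        self.apps.setdefault(path, set())

    def submit(self, path, app_id):
        # A draining queue accepts no new apps in this model (an assumption).
        if path not in self.apps or path in self.draining:
            raise ValueError(f"queue {path} does not accept applications")
        self.apps[path].add(app_id)

    def delete_if_empty(self, path):
        """Original-patch behaviour: delete only when no apps are present."""
        if self.apps[path]:
            return False
        del self.apps[path]
        return True

    def delete_with_drain(self, path):
        """Drain behaviour: mark the queue and remove it once it is empty."""
        if not self.delete_if_empty(path):
            self.draining.add(path)

    def finish(self, path, app_id):
        self.apps[path].discard(app_id)
        if path in self.draining and not self.apps[path]:
            self.draining.discard(path)
            del self.apps[path]
```

The sketch leaves open what happens if a queue with the same path is added again while the old one is still draining, which is the scenario raised in point 5.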

> Support for deleting queues without requiring a RM restart
> ----------------------------------------------------------
>
>                 Key: YARN-5556
>                 URL: https://issues.apache.org/jira/browse/YARN-5556
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Xuan Gong
>            Assignee: Naganarasimha G R
>         Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch, YARN-5556.v1.003.patch, YARN-5556.v1.004.patch
>
>
> Today, we can add or modify queues without restarting the RM, via a CS refresh. But
> to delete a queue, we have to restart the ResourceManager. We could support deleting
> queues without requiring an RM restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

