hadoop-yarn-issues mailing list archives

From "Sandy Ryza (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1495) Allow moving apps between queues
Date Fri, 13 Dec 2013 03:17:16 GMT

    [ https://issues.apache.org/jira/browse/YARN-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847108#comment-13847108 ]

Sandy Ryza commented on YARN-1495:

Thanks for taking a look, Vinod.

bq. Any specific use-case? Example where it can be used? To justify this isn't feature creep.
Yeah, we've seen requests for this a few times.  I think the most common scenario is that
someone's job is running slowly because of the queue it's in and needs to be moved
to a queue where it can complete more quickly.  This can occur because the job is taking longer
than expected and a deadline is approaching, the original queue is fuller than expected, the
job was submitted to the wrong queue in the first place but has made some progress, or for a number
of other reasons.

bq. What happens when scheduling-constraints are violated? The client will just get an error?
It kind of depends on the type of scheduling constraint.
I'm not sure how this should play out for the Capacity Scheduler, but for the Fair Scheduler constraints
I mentioned in the description, I think the client should get an error. I suppose another option
would be to kill containers until the constraints are satisfied, but I think this is
a lot more work and not clearly better behavior.
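
To make the "client gets an error" option concrete, here is a minimal sketch of a move that is
rejected up front when the target queue's limits would be exceeded. It is not YARN code: the
Queue, App, MoveRejectedException and moveApp names are hypothetical, and maxResources is
reduced to a single memory figure for brevity.

```java
// Hypothetical sketch only; none of these types are real YARN classes.
class QueueMoveSketch {

    /** Simplified queue that tracks only running apps and memory in use. */
    static class Queue {
        final String name;
        final int maxRunningApps;   // modeled on the Fair Scheduler maxRunningApps limit
        final long maxResourcesMb;  // modeled on maxResources, reduced to memory only
        int runningApps;
        long usedMb;

        Queue(String name, int maxRunningApps, long maxResourcesMb) {
            this.name = name;
            this.maxRunningApps = maxRunningApps;
            this.maxResourcesMb = maxResourcesMb;
        }
    }

    /** Simplified app: usedMb stands in for the resources of its running containers. */
    static class App {
        final String id;
        final long usedMb;
        Queue queue;

        App(String id, long usedMb, Queue queue) {
            this.id = id;
            this.usedMb = usedMb;
            this.queue = queue;
            queue.runningApps++;
            queue.usedMb += usedMb;
        }
    }

    /** Error reported back to the client when a move would violate a constraint. */
    static class MoveRejectedException extends RuntimeException {
        MoveRejectedException(String message) {
            super(message);
        }
    }

    /** Checks the target queue's limits first, and only then transfers the app's usage. */
    static void moveApp(App app, Queue target) {
        if (target.runningApps + 1 > target.maxRunningApps) {
            throw new MoveRejectedException(
                "Moving " + app.id + " would exceed maxRunningApps of queue " + target.name);
        }
        if (target.usedMb + app.usedMb > target.maxResourcesMb) {
            throw new MoveRejectedException(
                "Moving " + app.id + " would exceed maxResources of queue " + target.name);
        }
        Queue source = app.queue;
        source.runningApps--;
        source.usedMb -= app.usedMb;
        target.runningApps++;
        target.usedMb += app.usedMb;
        app.queue = target;
    }
}
```

Because both checks run before any state changes, a rejected move leaves the app and both queues
exactly as they were, which is what makes a plain error cheaper than killing containers.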

bq. Who initiates the move any regular user or just admins?
My opinion is any regular user, within ACLs.  I.e. if I could kill my job and resubmit it
to a different queue, I should be able to move it.

bq. Only running apps can be moved?
I don't see a reason that we shouldn't be able to move an app that has been submitted, but
not accepted, or that is very close to completion.  In some cases we may not need to touch
the scheduler.  There are definitely race conditions we need to be careful of here.

bq. Apps may be in the process of submitting new requests. What happens to them? I guess queue-move
and new-requests should be synchronized.

> Allow moving apps between queues
> --------------------------------
>                 Key: YARN-1495
>                 URL: https://issues.apache.org/jira/browse/YARN-1495
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: scheduler
>    Affects Versions: 2.2.0
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
> This is an umbrella JIRA for work needed to allow moving YARN applications from one queue
to another.  The work will consist of additions in the command line options, additions in
the client RM protocol, and changes in the schedulers to support this.
> I have a picture of how this should function in the Fair Scheduler, but I'm not familiar
enough with the Capacity Scheduler to do the same there.  Ultimately, the decision as to whether
an application can be moved should come down to the scheduler - some schedulers may wish not
to support this at all.  However, schedulers that do support it should share some common semantics
around ACLs and what happens to running containers.
> Here is how I see the general semantics working out:
> * A move request is issued by the client.  After it gets past ACLs, the scheduler checks
whether executing the move will violate any constraints. For the Fair Scheduler, these would
be the queue maxRunningApps and queue maxResources constraints.
> * All running containers are transferred from the old queue to the new queue
> * All outstanding requests are transferred from the old queue to the new queue
> Here is how I see the ACLs of this working out:
> * To move an app from a queue a user must have modify access on the app or administer
access on the queue
> * To move an app to a queue a user must have submit access on the queue or administer
access on the queue 
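
The two ACL rules in the description can be read as a pair of independent checks, one for leaving
the old queue and one for entering the new one. A minimal sketch, assuming the four permissions
have already been resolved to booleans (the MoveAclSketch class and canMove method are
hypothetical names, not part of YARN):

```java
// Hypothetical sketch only; not the YARN ACL API.
class MoveAclSketch {

    /**
     * Returns true when the user may move the app.
     *
     * @param modifyApp       user has modify access on the app
     * @param administerOld   user has administer access on the queue the app is leaving
     * @param submitNew       user has submit access on the queue the app is entering
     * @param administerNew   user has administer access on the queue the app is entering
     */
    static boolean canMove(boolean modifyApp, boolean administerOld,
                           boolean submitNew, boolean administerNew) {
        // Rule 1: moving an app from a queue needs modify on the app or administer on that queue.
        boolean mayLeave = modifyApp || administerOld;
        // Rule 2: moving an app to a queue needs submit or administer on that queue.
        boolean mayEnter = submitNew || administerNew;
        return mayLeave && mayEnter;
    }
}
```

Under this reading, a user who owns the job and can submit to the target queue passes both checks,
which matches the "kill and resubmit" argument made in the comment above.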
