hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7117) Capacity Scheduler: Support Auto Creation of Leaf Queues While Doing Queue Mapping
Date Fri, 29 Sep 2017 20:45:01 GMT

    [ https://issues.apache.org/jira/browse/YARN-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186396#comment-16186396
] 

Jason Lowe commented on YARN-7117:
----------------------------------

Thanks for providing the doc, Wangda!

I think the syntax would be more concise and easier to read if the queue could be specified
as a sub-path that can optionally include the parent queue.  For example, rather than {{u:user1:queue1(parent-queue=marketing)}}
the syntax could be simplified to: {{u:user1:marketing.queue1}}.
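A minimal sketch of how such a sub-path could be split into parent and leaf, assuming the mapping has the form {{type:source:queue-path}}. The names {{QueueMapping}} and {{parse_mapping}} are hypothetical and are not part of the Capacity Scheduler code:

```python
from typing import NamedTuple, Optional


class QueueMapping(NamedTuple):
    kind: str               # "u" for a user mapping, "g" for a group mapping
    source: str             # user or group name (or a placeholder such as %user)
    parent: Optional[str]   # parent queue path, or None when none was given
    leaf: str               # leaf queue name


def parse_mapping(rule: str) -> QueueMapping:
    """Parse 'kind:source:queue-path', where queue-path may include a parent."""
    kind, source, queue_path = rule.split(":", 2)
    if kind not in ("u", "g"):
        raise ValueError(f"unknown mapping type: {kind!r}")
    # Everything before the last dot is the parent path; the rest is the leaf.
    parent, _, leaf = queue_path.rpartition(".")
    if not leaf:
        raise ValueError(f"missing leaf queue in rule: {rule!r}")
    return QueueMapping(kind, source, parent or None, leaf)


print(parse_mapping("u:user1:marketing.queue1"))
print(parse_mapping("u:user1:queue1"))
```

With this shape the parent stays optional, so existing mappings that name only a leaf queue would parse unchanged.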

I'm not really familiar with queue mappings, but I'm assuming the order they are specified
is significant to deterministically resolve cases where more than one specified rule would
apply to a user.  If so then the example is confusing since it looks like the {{u:user2:%primary_group(parent-queue=finance)}}
rule will always be eclipsed by the preceding {{u:%user:%user(parent=engineering)}} rule.
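To illustrate the concern, here is a small sketch that assumes rules are evaluated in order and the first match wins. The tuple representation and the {{resolve}} function are hypothetical, and the group name {{analysts}} is made up for the example:

```python
from typing import List, Optional, Tuple

# (user pattern, queue pattern, parent queue): a hypothetical representation
Rule = Tuple[str, str, str]


def resolve(user: str, primary_group: str, rules: List[Rule]) -> Optional[str]:
    """Return the queue path chosen by the first rule that matches the user."""
    for user_pattern, queue_pattern, parent in rules:
        if user_pattern not in ("%user", user):
            continue  # rule names a different user
        queue = queue_pattern.replace("%user", user)
        queue = queue.replace("%primary_group", primary_group)
        return f"{parent}.{queue}"
    return None


rules = [
    ("%user", "%user", "engineering"),        # wildcard rule listed first
    ("user2", "%primary_group", "finance"),   # never reached for user2
]

print(resolve("user2", "analysts", rules))
print(resolve("user2", "analysts", list(reversed(rules))))
```

Under that assumption the wildcard rule catches {{user2}} before the more specific rule is considered; listing the specific rule first avoids the problem.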

{quote}
If we don’t have guaranteed room in the parent queue, queues with 0 capacity (best effort
queue) will be created. Applications running in these best effort queues could be starving
if no capacity is available
{quote}

This conflicts with the proposal above to fail the submission when the queue cannot be created
with guarantees.  It seems weird for user A to get guaranteed capacity while user B gets
_zero_ guarantees just because they submitted a second later than user A.  IMHO if the
admin wants to configure guarantees for auto-created queues then we should not assume that
they're going to be OK with auto-created queues that do not meet those specifications.  Otherwise
I'd assume the admin would forgo guaranteed capacities on the auto queues and just have them
carve up the parent queue proportionally.

The capacity management with guaranteed capacities refers to a "configured-threshold" but
the interface to set that threshold is not documented above.

The document implies that there are SLAs with guaranteed capacity auto-queues, but that's clearly
not the case.  In the example, it's true that the applications submitted to q4 and q5 eventually
ran with guaranteed capacities.  However, they waited an unbounded amount of time to start
running, which means we cannot always hit SLAs.  Users in q1/q2/q3 can collectively deny apps
in q4/q5 ever running, for example.

For the alternative approach where all of the queues are "best effort" we don't have to always
have the max-am-resource at 0%.  We could specify the max-am as a percent of the max cap for
those queues or a separate config specific to them, or whatever.  Or we could have the queues
auto-distribute the capacities of the parent as new queues are added.  In other words the
auto-queue capacity is 1/(num auto queues) of the parent and the max-capacity is always 100%.
 Preemption can be used to keep the queues fair if one user tries to dominate over the others,
but capacities of underutilized queues can be leveraged by others.
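The equal-share alternative can be sketched as follows. The function name and the percentage representation are assumptions made for illustration, not an existing scheduler API:

```python
from typing import Dict, List, Tuple


def auto_queue_capacities(auto_queues: List[str]) -> Dict[str, Tuple[float, float]]:
    """Map each auto-queue to (capacity %, max-capacity %) of its parent.

    Every auto-queue gets 1/(num auto queues) of the parent as its capacity,
    and may grow up to 100% of the parent when other queues are idle.
    """
    if not auto_queues:
        return {}
    share = 100.0 / len(auto_queues)
    return {name: (share, 100.0) for name in auto_queues}


print(auto_queue_capacities(["q1", "q2", "q3", "q4"]))
```

Each time a queue is added or removed the shares would be recomputed, so with four auto-queues each one is guaranteed a quarter of the parent.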

If a user has ACLs to the parent queue then I believe they have those ACLs to the entire hierarchy
of that queue.  That means if the parent queue says they can submit then they'll be able to
submit to any auto-queue underneath that parent.  We'll either need a new ACL for auto-queue
creation separate from app submission or change the semantics of ACL inheritance for auto-queues.
 The former probably makes more sense and would be more intuitive, since admins are already used
to the inheritance behavior of today's queue ACLs.  It would also let admins configure
parent-queue-privileged users who get admin-like access to all the auto-queues of a parent queue
without being full admins across all queues.

Yes, if a user does not have the ability to create an auto-queue and/or submit, then the
submission should fail.
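A rough sketch of the "separate ACL" option, assuming a hypothetical create ACL on the parent alongside the existing, inherited submit ACL. None of these class or field names come from YARN:

```python
from dataclasses import dataclass, field
from typing import Set


class SubmissionRejected(Exception):
    """Raised when a user may not submit to, or create, the target queue."""


@dataclass
class ParentQueue:
    name: str
    submit_acl: Set[str] = field(default_factory=set)  # inherited by all leaves
    create_acl: Set[str] = field(default_factory=set)  # hypothetical new ACL
    leaves: Set[str] = field(default_factory=set)      # existing auto-queues


def submit(user: str, parent: ParentQueue, leaf: str) -> str:
    """Return the queue path for the submission, creating the leaf if allowed."""
    if user not in parent.submit_acl:
        raise SubmissionRejected(f"{user} may not submit under {parent.name}")
    if leaf not in parent.leaves:
        # Creating a new auto-queue needs the separate create permission.
        if user not in parent.create_acl:
            raise SubmissionRejected(f"{user} may not create {parent.name}.{leaf}")
        parent.leaves.add(leaf)
    return f"{parent.name}.{leaf}"


marketing = ParentQueue("marketing", submit_acl={"alice", "bob"}, create_acl={"alice"})
print(submit("alice", marketing, "q1"))  # alice creates q1
print(submit("bob", marketing, "q1"))    # bob may use the existing queue
```

Here submit permission keeps today's inheritance semantics, while a user without the create permission is rejected only when the target auto-queue does not exist yet.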

I don't know if it's critical to show auto-queues as a different color, but I think it would
be important to be able to determine _somehow_ via the UI that the queue was auto-created
so the admin doesn't wonder why they can't find the queue in the static queue configs.  This
might be as simple as an "Auto-Queue: true/false" line in the queue details box in the UI.



> Capacity Scheduler: Support Auto Creation of Leaf Queues While Doing Queue Mapping
> ----------------------------------------------------------------------------------
>
>                 Key: YARN-7117
>                 URL: https://issues.apache.org/jira/browse/YARN-7117
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: capacity scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-7117.Capacity.Scheduler.Support.Auto.Creation.Of.Leaf.Queue.pdf
>
>
> Currently Capacity Scheduler doesn't support auto creation of queues when doing queue
mapping. We are seeing more and more use cases that have complex queue mapping policies configured
to handle application-to-queue mapping.
> The most common use case of CapacityScheduler queue mapping is to create one queue for
each user/group. However, {{capacity-scheduler.xml}} must be updated and {{RMAdmin:refreshQueues}}
must be run whenever a new user/group is onboarded. One option to solve the problem is to
automatically create queues when a new user/group arrives.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

