hadoop-yarn-issues mailing list archives

From "Daniel Templeton (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels
Date Wed, 30 Aug 2017 17:42:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147663#comment-16147663 ]

Daniel Templeton edited comment on YARN-2497 at 8/30/17 5:41 PM:
-----------------------------------------------------------------

bq. We should guarantee that a queue with non-label cannot access a node with a label

Agreed, and the current patch does that.  What I still have to figure out is how to sensibly
give a queue access to both labeled nodes and nodes with no label.  In the capacity scheduler,
all queues can access nodes with no label.  I'm not sure that's the best approach.  For example,
assume I have a GPU label, and I want to make sure that any app requesting nodes with the GPU
label is scheduled with priority (because my GPU card is an expensive resource that I want to
see maximally used).  I therefore create a GPU queue and give that queue a very high weight.
If that queue also allowed apps with no label, then I could submit non-GPU jobs to that queue
just to boost my priority.  On the other hand, I want to be able to submit an app that uses
no label for the AM so that I don't consume GPU resources for no reason.  I still need to
ponder that one a little.
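
To make the trade-off above concrete, here is a minimal Java sketch.  It is not taken from
the patch or from the fair scheduler code.  The class QueueLabelPolicy, the allowsNoLabel
flag, and the use of an empty string to mean "no label" are all hypothetical names chosen
for illustration only.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a queue's label access; not part of YARN.
final class QueueLabelPolicy {
    // Assumption: the empty string stands for "no label".
    static final String NO_LABEL = "";

    private final Set<String> accessibleLabels;
    private final boolean allowsNoLabel;

    QueueLabelPolicy(Set<String> accessibleLabels, boolean allowsNoLabel) {
        this.accessibleLabels =
            Collections.unmodifiableSet(new HashSet<>(accessibleLabels));
        this.allowsNoLabel = allowsNoLabel;
    }

    // True if a request carrying the given label may be served from this queue.
    boolean canUse(String requestedLabel) {
        if (NO_LABEL.equals(requestedLabel)) {
            return allowsNoLabel;
        }
        // A queue with no labels configured can never reach a labeled node.
        return accessibleLabels.contains(requestedLabel);
    }

    public static void main(String[] args) {
        QueueLabelPolicy plain =
            new QueueLabelPolicy(Collections.<String>emptySet(), true);
        QueueLabelPolicy gpuOpen =
            new QueueLabelPolicy(Collections.singleton("GPU"), true);
        QueueLabelPolicy gpuStrict =
            new QueueLabelPolicy(Collections.singleton("GPU"), false);

        System.out.println("plain queue, GPU request: " + plain.canUse("GPU"));
        System.out.println("open GPU queue, unlabeled request: " + gpuOpen.canUse(NO_LABEL));
        System.out.println("strict GPU queue, unlabeled request: " + gpuStrict.canUse(NO_LABEL));
    }
}
```

In this sketch the "open" GPU queue is the one that lets a non-GPU job borrow the queue's
high weight, while the "strict" GPU queue blocks that but also blocks an unlabeled AM.
Neither setting resolves the question on its own.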

Multiple labels are explicitly not supported because of the chaos that would create.  See
YARN-3409 instead.

I do not intend to tackle relaxed partitions in this patch.  That's a much trickier implementation
that requires delayed scheduling.  Feel free to file a JIRA for it and work on it.

I will be testing failover for node labels, but I don't see any reason why it shouldn't work
as is.

bq. A queue should only access one label

That doesn't work.  Because an app can only be in one queue at a time, in order for an app
to use different labels for different containers, the queue must support multiple labels.
A primary use case, as I mentioned above, is an AM that doesn't want to consume a limited
resource that its tasks will need.  I don't like it either, but I don't see another way around
it.
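
The argument can be sketched in a few lines of Java.  This is an illustration under stated
assumptions, not scheduler code: MultiLabelQueueSketch and queueAdmitsApp are hypothetical
names, and "no label" is modeled as an empty string that a queue must list explicitly.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of YARN.
final class MultiLabelQueueSketch {
    // Assumption: the empty string stands for "no label".
    static final String NO_LABEL = "";

    // An app lives in exactly one queue, so that queue must be able to
    // serve the label of every container the app asks for.
    static boolean queueAdmitsApp(Set<String> queueLabels, List<String> containerLabels) {
        return queueLabels.containsAll(containerLabels);
    }

    public static void main(String[] args) {
        // The AM runs with no label; the two tasks need GPU nodes.
        List<String> app = Arrays.asList(NO_LABEL, "GPU", "GPU");

        Set<String> singleLabelQueue = Collections.singleton("GPU");
        Set<String> multiLabelQueue = new HashSet<>(Arrays.asList(NO_LABEL, "GPU"));

        System.out.println("single-label queue admits app: "
            + queueAdmitsApp(singleLabelQueue, app));
        System.out.println("multi-label queue admits app: "
            + queueAdmitsApp(multiLabelQueue, app));
    }
}
```

With a single-label queue the only way to run this app would be to put the AM on a GPU
node as well, which is exactly the waste the use case is trying to avoid.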


was (Author: templedf):
bq. We should guarantee that a queue with non-label cannot access a node with a label

Agreed, and the current patch does that.  What I still have to figure out is how to sensibly
assign a queue labels and no label.  In capacity scheduler, all queues can access nodes with
no label.  I'm not sure that's the best approach.  For example, assume I have a GPU label,
and I want to make sure that any app requesting nodes with the GPU label is scheduled as a
priority (because my GPU card is an expensive resource that I want to see maximally used).
 I therefore create a GPU queue and give that queue a very high weight.  If that queue also
allowed apps with no label, then I could submit non-GPU jobs to that queue just to boost my
priority.  On the other hand, I want to be able to submit an app that uses no label for the
AM so that I don't consume GPU resources for no reason.  I still need to ponder that one a
little.

Multiple labels are explicitly not supported because of the chaos that would create.  Instead
see YARN-3409.

I do not intend to tackle relaxed partitions in this patch.  That's a much trickier implementation
that requires delayed scheduling.

I will be testing failover for node labels, but I don't see any reason why it shouldn't work
as is.

bq. A queue should only access one label

That doesn't work.  Because an app can only be in one queue at a time, in order for an app
to use different labels for different containers, the queue must support multiple labels.
 A primary use case is as I mentioned above, an AM that doesn't want to consume a limit resource
that its tasks will need.  I don't like it either, but I don't see another way around it.

> Changes for fair scheduler to support allocate resource respect labels
> ----------------------------------------------------------------------
>
>                 Key: YARN-2497
>                 URL: https://issues.apache.org/jira/browse/YARN-2497
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Wangda Tan
>            Assignee: Daniel Templeton
>         Attachments: YARN-2499.WIP01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

