hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests
Date Sun, 13 Jul 2014 15:07:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060133#comment-14060133 ]

Wangda Tan commented on YARN-796:
---------------------------------


Reply:
Hi Yuliya,
Thanks for your reply. It’s great to read your doc and discuss it with you too. :)
Please see my reply below.

1) 
bq. What probably needs to be evaluated is what nodes satisfy a final/effective LabelExpression,
as nodes can come and go, labels on them can change
Agreed. What I meant is that we need to consider the performance of two things:
* The time to evaluate a label expression. IMO we need to add labels at the per-container level.
* Whether it is important to get the headroom, or how many nodes can be used, for an expression.
The simpler the expression, the easier it is for us to compute those results.
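
To illustrate the second point, here is a minimal sketch, assuming a label expression is a plain conjunction of labels, of how a label-to-nodes index could make "how many nodes can be used" cheap to answer. The `LabelIndex` class and its method names are hypothetical and not part of YARN.

```python
from collections import defaultdict


class LabelIndex:
    """Hypothetical label -> nodes index; not a YARN API."""

    def __init__(self):
        self._nodes_by_label = defaultdict(set)
        self._labels_by_node = {}

    def set_labels(self, node, labels):
        """Replace the labels of a node (nodes come and go, labels change)."""
        self.remove_node(node)
        self._labels_by_node[node] = set(labels)
        for label in self._labels_by_node[node]:
            self._nodes_by_label[label].add(node)

    def remove_node(self, node):
        """Forget a node and drop it from every label's node set."""
        for label in self._labels_by_node.pop(node, set()):
            self._nodes_by_label[label].discard(node)

    def nodes_matching_all(self, labels):
        """Return the nodes that carry every requested label (pure AND)."""
        labels = list(labels)
        if not labels:
            return set(self._labels_by_node)
        # Intersect the smallest sets first to keep the work small.
        sets = sorted(
            (self._nodes_by_label.get(label, set()) for label in labels),
            key=len,
        )
        result = set(sets[0])
        for candidate in sets[1:]:
            result &= candidate
        return result
```

With a flat conjunction the answer is a handful of set intersections. A combined expression would need a more general evaluator, which is part of the cost being weighed here.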

2) 
bq. Let me understand it better: If application provides multiple labels they are "AND"ed
and so only nodes that have the same set of labels or their superset will be used?
Yes.
I think this is important because a label is treated as a tangible resource here. Imagine
you are running an HBase master: you may want a node that is “stable”, “large_memory”, and
“for_long_running_service”. Or, if you are running a scientific computing program, you may
want a node that has “GPU”, “large_memory”, and “strong_cpu”. It does not make sense to use
“OR” in these cases.
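
A minimal sketch of the "AND" semantics discussed here: a node qualifies only if its label set is a superset of the requested labels. The function name, node names, and label assignments below are made up for illustration and are not taken from any patch.

```python
def node_satisfies(node_labels, requested_labels):
    """True if the node has every requested label (labels are ANDed)."""
    return set(requested_labels) <= set(node_labels)


# Hypothetical cluster; node names and label sets are invented.
nodes = {
    "node1": {"stable", "large_memory", "for_long_running_service"},
    "node2": {"GPU", "large_memory", "strong_cpu", "stable"},
    "node3": {"large_memory"},
}

hbase_master = ["stable", "large_memory", "for_long_running_service"]
scientific = ["GPU", "large_memory", "strong_cpu"]

hbase_nodes = sorted(
    name for name, labels in nodes.items() if node_satisfies(labels, hbase_master)
)
scientific_nodes = sorted(
    name for name, labels in nodes.items() if node_satisfies(labels, scientific)
)
```

In this sketch `node2` carries an extra label ("stable") beyond what the scientific job requests and still qualifies, while `node3` satisfies neither request.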

To Sandy/Amit, do you have any specific use case for OR?
My basic feeling about supporting different operators like “OR”/“NOT” here is that we may
support them if they have clear use cases and are in high demand. But we’d better not use
combined expressions. If we use combined expressions, we need to add parentheses, which
will increase the complexity of evaluating them.
Let's hear more thoughts from the community about this.
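
A minimal sketch of what an uncombined expression could look like, assuming a single operator applied to a flat list of labels. The `(op, labels)` representation is hypothetical and not a proposed YARN syntax, and the meaning given to "NOT" here (the node has none of the listed labels) is an assumption.

```python
def evaluate(op, labels, node_labels):
    """Evaluate one operator over a flat list of labels for one node."""
    node_labels = set(node_labels)
    if op == "AND":
        return all(label in node_labels for label in labels)
    if op == "OR":
        return any(label in node_labels for label in labels)
    if op == "NOT":
        # Assumed meaning: the node carries none of the listed labels.
        return not any(label in node_labels for label in labels)
    raise ValueError(f"unsupported operator: {op}")
```

Because nothing nests, there is no parser, precedence rule, or parenthesis handling. An expression such as `(A AND B) OR (NOT C)` would need a recursive evaluator instead.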


3) 
bq. Yes - so far this is a procedure. Not sure what is "hard" here, but we can have some API
to do it.
Do you have any ideas about what the API would look like?


4)
bq. Agree - that today this file may be only relevant to RM. If it is stored as local file
or by other means it is greater chance for it to be overwritten, lost in upgrade process.
Agree

5)
bq. And if we support this, it will be not sufficient to change isBlackListed at AppSchedulingInfo
only in scheduler to make fair/capacity scheduler works. We may need to modify implementations
of different schedulers.
Agree


6)
bq. Sure we can make them consistent, our thought process was that if you have multiple leaf
queues that should share the same label/policy you can specify it on the parent level, so
you don't need to "type" more then necessary 
I think for different schedulers, we should specify queue-related parameters in different
configurations. Let’s get more ideas from the community about how to specify queue parameters
before moving ahead. :)

Thanks,
Wangda

> Allow for (admin) labels on nodes and resource-requests
> -------------------------------------------------------
>
>                 Key: YARN-796
>                 URL: https://issues.apache.org/jira/browse/YARN-796
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Arun C Murthy
>            Assignee: Wangda Tan
>         Attachments: LabelBasedScheduling.pdf, Node-labels-Requirements-Design-doc-V1.pdf, YARN-796.patch
>
>
> It will be useful for admins to specify labels for nodes. Examples of labels are OS, processor architecture etc.
> We should expose these labels and allow applications to specify labels on resource-requests.
> Obviously we need to support admin operations on adding/removing node labels.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
