hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
Date Tue, 10 Mar 2015 05:29:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354362#comment-14354362

Naganarasimha G R commented on YARN-2495:

Hi [~wangda],
1) IMO method name was not readable when it was {{setAreNodeLabelsSet}} but i have changed
it to {{setAreNodeLabelsSetInReq}} i feel this is sufficient. setAreNodeLabelsUpdated is same
as earlier for which Craig had commented (which i also feel valid) 
 I would go with areNodeLablesSet (all "isNodeLabels" => "areNodeLabels" wherever it appears,
actually) - wrt "Set" vs "Updated" - this is primarily a workaround for the null/empty ambiguity
and I think this name better reflects what is really going on (am I sending a value to act
on or not), but I also think that this is a better contract, the receiver (rm) shouldn't really
care about the logic the nm side is using to decide whether or not to set it's labels (freshness,
"updatedness", whatever), so all that should be communicated in the api is whether or not
the value is set, not whether it's an update/whether it's checking freshness, etc. that's
a nit, but I think it's a clearer name.
 Yes true lets finalize the name this time after that will start working on the patch if not
it will be a wasted effort
It will be problematic to ask admins make NM/RM configuration keep synchronized, so I don't
want (and also not necessary) NM depends on RM's configuration.
So I suggest to make a changes: In NodeManager.java: when user doesn't configure provider,
it should be null. In your patch, you can return a null directly, and YARN-2729 will implement
the logic of instancing provider from config. In NodeStatusUpdaterImpl: avoid using isDistributedNodeLabelsConf,
since we will not have "distributedNodeLabelConf" in NM side if you agree on previously comment,
instead, it will check null of provider.
Well modifications side is clear to me but is it good to allow the configurations being different
from NM and RM ? Infact i wanted to discuss regarding whether to send shutdown during register
if NM is configured differently from RM, but waited for the base changes to go in before discussing
new stuff.

8) ??You can add an additional comments in line 626 for this.?? Ok will add a comment in LabelProvider.getLabels
, Idea is LabelProvider is expected to give same Labels continiously untill there is a change
and if null or empty is returned then No label is assumed

10) {{updateNodeLabelsInNodeLabelsManager -> updateNodeLabelsFromNMReport}} : will take
care in next patch
{{LOG.info(... accepted from RM, use LOG.debug and check isDebugEnabled.}} : I feel better
to Log this as "Error" as we are sending the labels only in case of any change and there has
to be some way to identify if labels for a given NM and also currently we are sending out
shutdown signal too.

??Make errorMessage clear: indicate 1# this is node labels reported from NM, and 2# it's failed
to be put to RM instead of "not properly configured".??
i think i have captured first point, but any way will reframe it as {{"Node Labels <labels>
reported from the NM with id <nodeID> were rejected from RM  with exception message
as <exceptionMsg>.}}

??Another thing we should do is, when distributed node label configuration is set, any direct
modify node to labels mapping from RMAdminCLI should be rejected (like -replaceNodeToLabels).??
Will work on this once 2495 and 2729 are done ..

Thanks [~vinodkv] & [~cwelch] for reviewing it 
??configuration.type -> configuration-type?? will take care in next patch
Should RegisterNodeManagerRequestProto.nodeLabels be a set instead? 
Do we really need NodeHeartbeatRequest.areNodeLabelsSetInReq()? Why not just look at the set
as mentioned in the previous comment?
Well as craig informed, RegisterNodeManagerRequestProto.nodeLabels  is already a set but as
by default empty set is provided by protoc, its req to inform whether labels are set as part
of request hence areNodeLabelsSetInReq is required.
??RegisterNodeManagerRequest is getting changed. It will be interesting to reason about rolling-upgrades
in this scenario.??
Well though i am not much aware of Rolling upgrades, i don't see any problems in a normal
case because RM tries to read the labels from NM's req only when its distributed conf and
also {{areNodeLabelsSetInReq}} is by default false. 
But I had queries when some existing setup they want to modify to distributed conf setup 
# Whether we need to send shutdown during register if NM is configured differently from RM
# Will the new configurations be added in NM and RM and then Rolling upgrade will be done
? or we do rolling upgrade first and then reconfigure & restart RM's and NM's

??How about we simply things? Instead of accepting labels on both registration and heartbeat,
why not restrict it to be just during registration??
Well i have the same opinion of Wangda's on this point i feel both the flows we should take
the same decision.

??We should not even accept a node's registration when it reports invalid labels??
Well i am little confused here, As per wangda's earlier comment i understand that it was your
comment to send shutdown (which i felt correct in terms of maintenance)
One more suggestion (as per suggested by Vinod Kumar Vavilapalli), when there's anything wrong
with node label reported from NM, we should fail NM (ask it to shutdown and give it proper
diagnostic message). This is because if NM report a label but rejected, even if RM tell NM
this, NM cannot handle it properly except print some error messages (we don't have smart logic
now). Which will lead to problems in debugging (A NM reported some label to RM but scheduler
failed allocating containers on the NM). To avoid it, a simple way is to shutdown the NM and
admin can take a look at what happened.

> Allow admin specify labels from each NM (Distributed configuration)
> -------------------------------------------------------------------
>                 Key: YARN-2495
>                 URL: https://issues.apache.org/jira/browse/YARN-2495
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Naganarasimha G R
>         Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, YARN-2495.20141030-1.patch,
YARN-2495.20141031-1.patch, YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, YARN-2495.20141204-1.patch,
YARN-2495.20141208-1.patch, YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, YARN-2495_20141022.1.patch
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or using script
suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels

This message was sent by Atlassian JIRA

View raw message