hadoop-yarn-issues mailing list archives

From "Carlo Curino (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4412) Create ClusterMonitor to compute ordered list of preferred NMs for OPPORTUNISTIC containers
Date Wed, 10 Feb 2016 18:13:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141349#comment-15141349 ]

Carlo Curino commented on YARN-4412:
------------------------------------

Hi [~asuresh], I really like the direction of the patch. I have had several conversations with
folks writing "smarter" applications who are asking for some "visibility" into what happens
in the cluster.
If I understand the structure correctly, with what you do we could either (a rough sketch follows the list):
 # pass this extra info between the RM and the AMRMProxy, but strip it out before it reaches
the app (thus preserving existing behavior, and hiding potentially sensitive info about the
cluster load), or
 # have the AMRMProxy forward this info to a "smarter" app.
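
To make the two options concrete, a minimal sketch (only {{AllocateResponse}} is a real YARN
type; {{ExtendedAllocateResponse}} and its accessors are made-up placeholders):
{code:java}
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;

// Hypothetical envelope: the vanilla response plus the RM-side cluster-load
// info. None of this exists in YARN today; it only illustrates the two options.
public final class AMRMProxyResponseFilter {

  public interface ExtendedAllocateResponse {
    AllocateResponse getBaseResponse(); // what a vanilla AM expects
    Object getClusterLoadInfo();        // extra fields, e.g. top-k queued nodes
  }

  // Option 1: strip the extra info before it reaches the app, preserving
  // existing behavior and hiding potentially sensitive cluster-load data.
  public AllocateResponse forUntrustedApp(ExtendedAllocateResponse ext) {
    return ext.getBaseResponse();
  }

  // Option 2: forward everything to a trusted, "smarter" app.
  public ExtendedAllocateResponse forTrustedApp(ExtendedAllocateResponse ext) {
    return ext;
  }
}
{code}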

I think it is important to be able to *enforce* the above behaviors, i.e., a sneaky AM should
not be able to talk directly to the RM, pretending to be the AMRMProxy, and grab those extra
fields. This could be accomplished with an extra round of "tokens" that allow talking the
extended protocol rather than just the basic one. A trusted app can receive these tokens,
while an untrusted app will not. The AMRMProxy is part of the infrastructure, so it should
have these special tokens.
Does this make sense?
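
A rough sketch of the enforcement I have in mind (the token type and its check are
placeholders, not an existing YARN token kind):
{code:java}
// Sketch only: the RM populates the extended fields only for callers that
// present a special "extended protocol" token. A sneaky AM talking to the RM
// directly has no such token and falls back to the basic protocol.
public final class ExtendedProtocolGuard {

  // Placeholder for a token that only trusted infrastructure (the AMRMProxy)
  // receives; a real implementation would piggyback on YARN's token machinery.
  public interface ExtendedProtocolToken {
    boolean verify();
  }

  public boolean mayUseExtendedProtocol(ExtendedProtocolToken token) {
    return token != null && token.verify();
  }
}
{code}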

Given the above demand from app writers, I think it would be nice to *generalize* what you
are doing: a more general-purpose {{ClusterStatus}} object to be passed down, of which your
{{QueuedContainersStatus}} (which returns a top-k of queueing behavior at nodes) is one
specific instantiation. Just to make an example, I can easily see a latency-critical serving
service, trying to figure out where best to place its tasks, asking for information about
the average CPU/NET/DISK utilization of all available nodes before requesting to run on the
few that are (according to this service's custom metrics) the best fit. This shouldn't be
too hard; I am just proposing a more general wrapper object, which would allow us later on
to leverage this very same mechanism for more than what you guys do today. I think it would
make this a very valuable service to provide to app writers, especially as we head towards
more and more services.
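
Purely illustrative, the shape I have in mind (only {{QueuedContainersStatus}} is a name from
your patch; every other name and field here is a placeholder):
{code:java}
import java.util.List;
import org.apache.hadoop.yarn.api.records.NodeId;

// A general-purpose wrapper passed down in the allocate response; concrete
// instantiations decide which per-node metrics they expose.
public interface ClusterStatus {

  interface NodeLoad {
    NodeId getNodeId();
  }

  // e.g. the top-k of queueing behavior at nodes (illustrative shape only).
  interface QueuedContainersStatus extends NodeLoad {
    int getEstimatedQueueWaitMs();
  }

  // e.g. average utilization, for a latency-critical serving service.
  interface UtilizationStatus extends NodeLoad {
    float getAvgCpuUtilization();
    float getAvgNetworkUtilization();
    float getAvgDiskUtilization();
  }

  // Whatever per-node view the RM chose to publish in this response.
  List<? extends NodeLoad> getNodeLoads();
}
{code}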

Nits:
 * I think documentation on the various classes would help. E.g., you introduce a
{{DistributedSchedulingService}} that, from other discussions, I understand is useful, but
just staring at the code it is hard to see why we need all this.
 * In {{ResourceManager}}, if you are factoring out all the "guts" of SchedulerEventDispatcher,
can't we simply move the class out? There is nothing left in the RM other than a local rename, right?
 * Can you clarify what happens in {{DistributedSchedulingService.getServer()}}? The comment
has double negations, and I am not clear on what a reflectiveBlockingService does.
 * {{registerApplicationMasterForDistributedScheduling}} assumes resources will have only
cpu/mem. This might change soon. Is there a better way to load this info from configuration?
It would be nice to have a config.getResource("blah") which takes care of this (rough sketch
after these nits).
 * I see tests for {{TopKNodeSelector}}, but for nothing else. Is this enough?
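
Rough sketch of the config.getResource(...) idea from the nit above ({{Configuration}} and
{{Resource.newInstance}} are real; the config keys are made up for illustration):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.Resource;

public final class ConfiguredResources {

  // Builds a Resource from configuration instead of hardcoding cpu/mem;
  // if more resource types are added later, only this helper changes.
  public static Resource getResource(Configuration conf, String prefix,
      int defaultMemMb, int defaultVcores) {
    int memMb = conf.getInt(prefix + ".memory-mb", defaultMemMb);
    int vcores = conf.getInt(prefix + ".vcores", defaultVcores);
    return Resource.newInstance(memMb, vcores);
  }
}
{code}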


> Create ClusterMonitor to compute ordered list of preferred NMs for OPPORTUNISTIC containers
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-4412
>                 URL: https://issues.apache.org/jira/browse/YARN-4412
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-4412-yarn-2877.v1.patch, YARN-4412-yarn-2877.v2.patch
>
>
> Introduce a Cluster Monitor that aggregates load information from individual Node Managers
> and computes an ordered list of preferred Node Managers to be used as target Nodes for
> OPPORTUNISTIC container allocations.
> This list can be pushed out to the Node Manager (specifically the AMRMProxy running on
> the Node) via the Allocate Response. This will be used to make local scheduling decisions.
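
(Not from the patch: just to illustrate the "ordered list of preferred NMs" above, here is the
simplest possible ordering, i.e., the k nodes with the fewest queued containers; the actual
{{TopKNodeSelector}} may use a different metric or structure.)
{code:java}
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.apache.hadoop.yarn.api.records.NodeId;

public final class PreferredNodes {

  // Returns up to k node ids, fewest queued containers first.
  public static List<NodeId> topK(Map<NodeId, Integer> queuedPerNode, int k) {
    return queuedPerNode.entrySet().stream()
        .sorted(Map.Entry.comparingByValue())
        .limit(k)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }
}
{code}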



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
