hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4412) Create ClusterMonitor to compute ordered list of preferred NMs for OPPORTUNISTIC containers
Date Tue, 16 Feb 2016 08:24:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148261#comment-15148261 ]

Arun Suresh commented on YARN-4412:
-----------------------------------

Many thanks for the detailed review [~curino].

# I fully agree that AMs should be explicitly authorized before they can send and
receive cluster information via the extended protocol; YARN-4631 has been raised to track
this.
# With regard to generalizing {{QueuedContainersStatus}} into a {{ClusterStatus}}: please
note that this is metadata sent from the NM to the RM, so *ClusterStatus* might
not apply here. I do agree that we can probably add more cluster information to the {{DistributedSchedulingProtocol}},
which we introduced in YARN-2885. Note also that the node heartbeat already contains both Container
and aggregate Node resource utilization information. {{QueuedContainersStatus}} is
just another utilization metric: the {{ClusterMonitor}} running on the RM and the
DistributedScheduling framework use it to gauge the relative load on a Node based on the state
of its queue (maintained by the {{ContainersMonitor}}, which queues OPPORTUNISTIC container
requests).
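
As a rough illustration only (this is not code from the patch), the idea of ranking Nodes by reported queue state could be sketched as below. The class name {{QueueLengthNodeRanker}}, its methods, the use of plain string node ids, and ranking purely by queue length are all assumptions made for the example; the actual {{ClusterMonitor}} and {{TopKNodeSelector}} may differ.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: keeps the latest queue length reported by each node
// and returns the k least-loaded nodes. Not the TopKNodeSelector from the patch.
class QueueLengthNodeRanker {
  // nodeId -> number of queued containers in the most recent heartbeat
  private final Map<String, Integer> queueLengths = new ConcurrentHashMap<>();
  private final int k;

  QueueLengthNodeRanker(int k) {
    if (k <= 0) {
      throw new IllegalArgumentException("k must be positive");
    }
    this.k = k;
  }

  // Record the queue length carried by a node heartbeat (assumed input).
  void onHeartbeat(String nodeId, int queuedContainers) {
    queueLengths.put(nodeId, queuedContainers);
  }

  // Forget a node that has left the cluster.
  void removeNode(String nodeId) {
    queueLengths.remove(nodeId);
  }

  // Up to k node ids, shortest queue first; ties are broken by node id
  // so that the ordering is deterministic.
  List<String> preferredNodes() {
    // Copy the entries so that sorting works on a stable snapshot.
    List<Map.Entry<String, Integer>> snapshot = new ArrayList<>();
    for (Map.Entry<String, Integer> e : queueLengths.entrySet()) {
      snapshot.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
    }
    snapshot.sort((a, b) -> {
      int byLoad = Integer.compare(a.getValue(), b.getValue());
      return byLoad != 0 ? byLoad : a.getKey().compareTo(b.getKey());
    });
    List<String> result = new ArrayList<>();
    for (int i = 0; i < snapshot.size() && i < k; i++) {
      result.add(snapshot.get(i).getKey());
    }
    return result;
  }
}
```

Sorting a snapshot on every call keeps the sketch short; a real implementation would more likely recompute the list periodically rather than per request.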

bq.  ..documentation on the various classes would help. e.g., you introduce a DistributedSchedulingService,
..
Agreed, I have added some class level docs to some of the new classes introduced here.

bq. ... if you are factoring out all the "guts" of SchedulerEventDispatcher, can't we simply
move the class out? ..
Agreed.

bq. Can you clarify what happens in DistributedSchedulingService.getServer() ?...
Fixed the comment to explain this.

bq. ..assumes resources will have only cpu/mem...Is there any better way to load this info
from configuration? It would be nice to have a config.getResource("blah"), which takes care
of this...
Good point. Unfortunately, the Configuration object does not currently support {{getResource()}}.
Once the generalized resource model lands, I will circle back to this.
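
For illustration, one possible shape for such a helper is sketched below. Everything here is hypothetical: it uses {{java.util.Properties}} as a stand-in for the Hadoop Configuration object, and the {{getResource}} name, the key layout ({{<prefix>.<resource-type>}}) and the example keys are assumptions, not an existing API.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper: collects every "<prefix>.<resource-type>" key into a
// map of resource type -> amount, so callers need not hard-code cpu/mem.
// Properties stands in for the real Configuration object in this sketch.
class ResourceConfigReader {
  static Map<String, Long> getResource(Properties conf, String prefix) {
    Map<String, Long> resource = new TreeMap<>();
    String dotted = prefix + ".";
    for (String name : conf.stringPropertyNames()) {
      if (name.startsWith(dotted) && name.length() > dotted.length()) {
        String resourceType = name.substring(dotted.length());
        // Long.parseLong throws NumberFormatException on malformed values.
        resource.put(resourceType, Long.parseLong(conf.getProperty(name).trim()));
      }
    }
    return resource;
  }
}
```

With keys such as {{demo.min-allocation.memory-mb}} and {{demo.min-allocation.vcores}} (made-up names), {{getResource(conf, "demo.min-allocation")}} would return both entries, and any additional resource type would be picked up without code changes.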

bq. I see tests for TopKNodeSelector, but for nothing else. Is this enough?
Definitely not, but we have to wait for the actual changes in the {{ContainerManager}} and
{{ContainersMonitor}} classes, handled in YARN-2883, to test this end-to-end. In the meantime,
I will add tests to verify that the extra fields in the protobuf are handled correctly.


> Create ClusterMonitor to compute ordered list of preferred NMs for OPPORTUNISTIC containers
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-4412
>                 URL: https://issues.apache.org/jira/browse/YARN-4412
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-4412-yarn-2877.v1.patch, YARN-4412-yarn-2877.v2.patch
>
>
> Introduce a Cluster Monitor that aggregates load information from individual Node Managers
and computes an ordered list of preferred Node Managers to be used as target Nodes for OPPORTUNISTIC
container allocations.
> This list can be pushed out to the Node Manager (specifically the AMRMProxy running on
the Node) via the Allocate Response. This will be used to make local scheduling decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
