hadoop-yarn-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-371) Consolidate resource requests in AM-RM heartbeat
Date Mon, 04 Feb 2013 15:00:19 GMT

    [ https://issues.apache.org/jira/browse/YARN-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570314#comment-13570314 ]

Arun C Murthy commented on YARN-371:

Looks like there's a misunderstanding here - Sandy talks about reducing the memory requirements
of the RM. If I understand the proposal correctly, the number of resource request objects
sent by the AM in MR would be reduced from five (three node-local, one rack-local, one ANY)
to one resource request with an array of locations (host names) of length five.

Please read my explanation again. 

This change is *explicitly* against the design goals of YARN ResourceManager and would increase
memory requirements of RM by a couple of orders of magnitude.

Hadoop MR applications, routinely, have 100K+ tasks. The proposed change in this jira would
require 100K+ resource-requests (one per task). Currently, in YARN, that can be expressed
in O(nodes + racks + 1) resource-requests, which is ~O(5000) on even the largest clusters
known today. 

So, in effect, this change would be a significant regression and result in 100,000 resource-requests
vs. the ~5000 needed today.
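
To make the counting argument concrete, here is a rough sketch (not YARN code) that builds both representations for a toy cluster. The function names, host and rack naming, and cluster sizes are all hypothetical, chosen only for illustration. The aggregated table keeps one entry per distinct location with a container count, so its size is bounded by nodes + racks + 1 no matter how many tasks there are, while the per-task form grows with the task count.

```python
import random
from collections import Counter


def build_cluster(num_racks, nodes_per_rack):
    """Return a hypothetical host -> rack mapping for a toy cluster."""
    return {
        f"host{r}-{n}": f"/rack{r}"
        for r in range(num_racks)
        for n in range(nodes_per_rack)
    }


def aggregated_table(task_hosts, rack_of):
    """Aggregated form: one entry per distinct location, holding a count."""
    table = Counter()
    for hosts in task_hosts:
        for host in hosts:
            table[host] += 1
        # Count each rack once per task, even if several hosts share it.
        for rack in {rack_of[h] for h in hosts}:
            table[rack] += 1
        table["*"] += 1
    return table


def per_task_requests(task_hosts, rack_of):
    """Per-task form: one request per task, carrying its list of locations."""
    return [
        sorted(hosts) + sorted({rack_of[h] for h in hosts}) + ["*"]
        for hosts in task_hosts
    ]


# Toy numbers (assumptions): 4 racks x 10 nodes, 1000 tasks, 3 preferred hosts each.
NUM_RACKS = 4
rng = random.Random(0)
rack_of = build_cluster(num_racks=NUM_RACKS, nodes_per_rack=10)
all_hosts = sorted(rack_of)
task_hosts = [rng.sample(all_hosts, 3) for _ in range(1000)]

table = aggregated_table(task_hosts, rack_of)
requests = per_task_requests(task_hosts, rack_of)

upper_bound = len(rack_of) + NUM_RACKS + 1
print(f"aggregated entries: {len(table)} (bound: {upper_bound})")
print(f"per-task requests:  {len(requests)}")
```

Scaling the same shapes to 100K+ tasks on a few thousand nodes gives the gap described above: the aggregated table stays bounded by the cluster size, and the per-task list does not.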

bq. BTW Arun, immediately vetoing an issue in the first comment is not conducive to a balanced

Tom - You can read it as a veto, or you can read it as *I strongly disagree since this is
against the goals of the project and a significant regression*. IAC, we should allow for people's
communication style... and keep discussions technical - I'd appreciate that.

> Consolidate resource requests in AM-RM heartbeat
> ------------------------------------------------
>                 Key: YARN-371
>                 URL: https://issues.apache.org/jira/browse/YARN-371
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: api, resourcemanager, scheduler
>    Affects Versions: 2.0.2-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
> Each AMRM heartbeat consists of a list of resource requests. Currently, each resource
request consists of a container count, a resource vector, and a location, which may be a node,
a rack, or "*". When an application wishes to request a task run in multiple locations, it
must issue a request for each location.  This means that for a node-local task, it must issue
three requests, one at the node-level, one at the rack-level, and one with * (any). These
requests are not linked with each other, so when a container is allocated for one of them,
the RM has no way of knowing which others to get rid of. When a node-local container is allocated,
this is handled by decrementing the number of requests on that node's rack and in *. But when
the scheduler allocates a task with a node-local request on its rack, the request on the node
is left there.  This can cause delay-scheduling to try to assign a container on a node that
nobody cares about anymore.
> Additionally, unless I am missing something, the current model does not allow requests
for containers only on a specific node or specific rack. While this is not a use case for
MapReduce currently, it is conceivable that it might be something useful to support in the
future, for example to schedule long-running services that persist state in a particular location,
or for applications that generally care less about latency than data-locality.
> Lastly, the ability to understand which requests are for the same task will possibly
allow future schedulers to make more intelligent scheduling decisions, as well as permit a
more exact understanding of request load.
> I would propose the tweak of allowing a single ResourceRequest to encapsulate all the
location information for a task.  So instead of just a single location, a ResourceRequest
would contain an array of locations, including nodes that it would be happy with, racks that
it would be happy with, and possibly *.  Side effects of this change would be a reduction
in the amount of data that needs to be transferred in a heartbeat, as well as in the RM's
memory footprint, because what used to be different requests for the same task are now able
to share some common data.
> While this change breaks compatibility, if it is going to happen, it makes sense to do
it now, before YARN becomes beta.
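
The bookkeeping gap described in the quoted issue (a node-level request left behind after a rack-local allocation) can be sketched roughly as follows. This is a simplified model, not the scheduler's actual code. The helper names and the `h1` / `/r1` locations are hypothetical.

```python
def allocate_node_local(table, node, rack):
    """Node-local allocation: decrement the node, its rack, and "*"."""
    for location in (node, rack, "*"):
        table[location] -= 1


def allocate_rack_local(table, rack):
    """Rack-local allocation: only the rack and "*" are decremented.

    Nothing links this allocation back to the node-level request that
    belonged to the same task, so that entry is left untouched.
    """
    for location in (rack, "*"):
        table[location] -= 1


# One task that prefers host "h1" on rack "/r1" (three unlinked requests).
table = {"h1": 1, "/r1": 1, "*": 1}

# The scheduler places the task on some other host in "/r1".
allocate_rack_local(table, "/r1")

print(table)  # {'h1': 1, '/r1': 0, '*': 0}
```

The leftover `h1` entry is the request that delay scheduling may still try to satisfy even though the task has already been placed.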
