hadoop-yarn-issues mailing list archives

From "Konstantinos Karanasos (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2877) Extend YARN to support distributed scheduling
Date Sat, 22 Nov 2014 15:20:36 GMT

    [ https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221990#comment-14221990 ]

Konstantinos Karanasos commented on YARN-2877:

[~Sujeet Varakhedi], also the Apollo paper (OSDI 2014) has interesting ideas about distributed

[~wangda], glad you like the idea and thanks for the interesting points. To answer your questions:
1. The LocalRM can limit the number of queueable containers that each AM receives without
involving the central RM. In addition, the heartbeat response from the RM to the NM will carry
information about the status of the other queues in the system. This way we will be able to
impose global policies (such as capacity) in a distributed fashion. BTW, the LocalRMs also use
this information to decide at which NMs to queue requests.
2. If no policies need to be imposed, the central RM does not need to know anything about
the queueable containers that each AM uses. Limits on the number of queueable containers per
AM can be imposed directly by the LocalRM. However, if fine-grained policies need to
be imposed (as in point (1) above, e.g. the number of queueable containers per queue
in the capacity scheduler), the central RM can receive information about the number of queueable
containers used by each AM, so that it can impose limits per queue. Clearly, the more information
you pass to the central RM, the more powerful the policies you can impose, but also the bigger
the load you push to the central RM. So there is a sweet spot, based on the needs of
each cluster.
3. This is a good point as well. Such information can be piggybacked on the heartbeats to
the central RM (again, with the tradeoffs discussed above).
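
To make points (1) and (2) a bit more concrete, here is a minimal sketch of the two kinds of limits. It is only an illustration under my own assumptions: the class `QueueableContainerLimiter` and all of its methods are hypothetical names, not existing YARN APIs, and the per-queue "headroom" stands in for whatever queue status the RM would actually pass in the heartbeat response.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not a YARN class): admission check a LocalRM could
 * apply before handing out a queueable container.
 */
final class QueueableContainerLimiter {
    // Per-AM cap, enforced locally without involving the central RM.
    private final int maxPerAm;
    // Queueable containers currently held by each AM (as seen by this LocalRM).
    private final Map<String, Integer> inUseByAm = new HashMap<>();
    // Assumed form of the queue status from the heartbeat response:
    // remaining queueable containers per queue. Absent queue = no known limit.
    private final Map<String, Integer> queueHeadroom = new HashMap<>();

    QueueableContainerLimiter(int maxPerAm) {
        this.maxPerAm = maxPerAm;
    }

    /** Record the headroom for one queue, e.g. when a heartbeat response arrives. */
    public void updateQueueHeadroom(String queue, int remaining) {
        queueHeadroom.put(queue, remaining);
    }

    /** Returns true and updates the counters if the request fits both limits. */
    public boolean tryAllocate(String amId, String queue) {
        int used = inUseByAm.getOrDefault(amId, 0);
        if (used >= maxPerAm) {
            return false; // local per-AM limit reached
        }
        Integer headroom = queueHeadroom.get(queue);
        if (headroom != null) {
            if (headroom <= 0) {
                return false; // global per-queue policy exhausted
            }
            queueHeadroom.put(queue, headroom - 1);
        }
        inUseByAm.put(amId, used + 1);
        return true;
    }

    /** Called when one of the AM's queueable containers finishes. */
    public void release(String amId) {
        inUseByAm.computeIfPresent(amId, (k, v) -> v > 1 ? v - 1 : null);
    }

    public int inUse(String amId) {
        return inUseByAm.getOrDefault(amId, 0);
    }
}
```

With `maxPerAm = 2` and a headroom of 3 for a queue, a single AM is stopped after two containers by the local limit, while a second AM is stopped once the shared headroom for that queue runs out. How often the headroom is refreshed is exactly the tradeoff mentioned in point (2): fresher information means more load on the central RM.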

> Extend YARN to support distributed scheduling
> ---------------------------------------------
>                 Key: YARN-2877
>                 URL: https://issues.apache.org/jira/browse/YARN-2877
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Sriram Rao
> This is an umbrella JIRA that proposes to extend YARN to support distributed scheduling.
> Briefly, some of the motivations for distributed scheduling are the following:
> 1. Improve cluster utilization by opportunistically executing tasks on otherwise idle resources
> on individual machines.
> 2. Reduce allocation latency.  Tasks where the scheduling time dominates (i.e., task
> execution time is much less compared to the time required for obtaining a container from the
