hadoop-yarn-issues mailing list archives

From "Konstantinos Karanasos (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6344) Rethinking OFF_SWITCH locality in CapacityScheduler
Date Thu, 23 Mar 2017 18:35:41 GMT

    [ https://issues.apache.org/jira/browse/YARN-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938973#comment-15938973 ]

Konstantinos Karanasos commented on YARN-6344:

Thank you for the input, [~sunilg] and [~leftnoteasy].

[~sunilg], indeed we could eventually give the parameter as a percentage of the cluster nodes,
e.g., 0.25 would mean that we wait to hear from 25% of the cluster nodes before relaxing locality.
But we would then have to change the node-locality-delay to be a percentage too, in order
to be consistent. And this would mean that everybody would have to change their cluster configuration.
So, like Wangda says, maybe it is better to keep it as an absolute value for now?

[~leftnoteasy], I like your idea to put the rack-locality-delay as "additional opportunities
on top of node-locality-delay ones". I will update the patch.
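
To make the two ideas above concrete, here is a minimal sketch, not the patch itself. It assumes the rack delay is counted as extra missed scheduling opportunities beyond node-locality-delay, and that a percentage would simply be multiplied by the number of cluster nodes. The class, method, and parameter names are hypothetical and do not come from the CapacityScheduler code.

```java
// Hypothetical sketch; names are illustrative and not taken from the YARN code base.
final class LocalityDelaySketch {

    enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    // Most relaxed locality level allowed after a number of missed scheduling
    // opportunities. Assumption: locality is relaxed once the count exceeds the
    // threshold, and the rack delay is added on top of the node delay.
    static Locality allowedLocality(long missedOpportunities,
                                    int nodeLocalityDelay,
                                    int rackLocalityAdditionalDelay) {
        if (missedOpportunities <= nodeLocalityDelay) {
            return Locality.NODE_LOCAL;
        }
        long offSwitchThreshold = (long) nodeLocalityDelay + rackLocalityAdditionalDelay;
        if (missedOpportunities <= offSwitchThreshold) {
            return Locality.RACK_LOCAL;
        }
        return Locality.OFF_SWITCH;
    }

    // Assumption: a fractional setting (e.g., 0.25) is converted to an absolute
    // number of nodes by rounding up.
    static int delayFromFraction(double fraction, int clusterNodes) {
        return (int) Math.ceil(fraction * clusterNodes);
    }

    public static void main(String[] args) {
        // With node delay 40 and 20 additional opportunities for the rack level:
        System.out.println(allowedLocality(40, 40, 20)); // NODE_LOCAL
        System.out.println(allowedLocality(41, 40, 20)); // RACK_LOCAL
        System.out.println(allowedLocality(61, 40, 20)); // OFF_SWITCH
        // 25% of a 400-node cluster:
        System.out.println(delayFromFraction(0.25, 400)); // 100
    }
}
```

With the additive form, an existing node-locality-delay value keeps its meaning, which is the compatibility concern raised above for the percentage variant.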

> Rethinking OFF_SWITCH locality in CapacityScheduler
> ---------------------------------------------------
>                 Key: YARN-6344
>                 URL: https://issues.apache.org/jira/browse/YARN-6344
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Konstantinos Karanasos
>            Assignee: Konstantinos Karanasos
>         Attachments: YARN-6344.001.patch
> When relaxing locality from node to rack, the {{node-locality-delay}} parameter is used: when
the scheduling opportunities for a scheduler key exceed the value of this parameter, we
relax locality and try to assign the container to a node in the corresponding rack.
> On the other hand, when relaxing locality to off-switch (i.e., assigning the container anywhere
in the cluster), we are using a {{localityWaitFactor}}, which is computed as the number
of outstanding requests for a specific scheduler key divided by the size of the cluster.
> In case of applications that request containers in big batches (e.g., traditional MR
jobs), and for relatively small clusters, the localityWaitFactor does not affect relaxing
locality much.
> However, in case of applications that request containers in small batches, this load
factor takes a very small value, which leads to assigning off-switch containers too soon.
This situation is even more pronounced in big clusters.
> For example, if an application requests only one container per request, the locality
will be relaxed after a single missed scheduling opportunity.
> The purpose of this JIRA is to rethink the way we are relaxing locality for off-switch
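
The behaviour described in the issue can be illustrated with a small sketch. It is a simplification under stated assumptions, not the scheduler's code: the wait factor is taken as outstanding requests divided by the number of cluster nodes and capped at 1, and off-switch assignment is assumed to be allowed once the missed opportunities exceed the outstanding count multiplied by that factor. The same count is used for both quantities to keep the example short. All names are hypothetical.

```java
// Hypothetical sketch of the off-switch relaxation described in the issue.
final class OffSwitchWaitSketch {

    // Assumption: factor = outstanding requests / cluster nodes, capped at 1.
    static float localityWaitFactor(int outstanding, int clusterNodes) {
        if (clusterNodes <= 0) {
            return 1.0f; // avoid division by zero in this sketch
        }
        return Math.min((float) outstanding / clusterNodes, 1.0f);
    }

    // Assumption: locality is relaxed to off-switch once the missed scheduling
    // opportunities exceed outstanding * localityWaitFactor.
    static boolean canRelaxToOffSwitch(int outstanding, int clusterNodes,
                                       long missedOpportunities) {
        float threshold = outstanding * localityWaitFactor(outstanding, clusterNodes);
        return threshold < missedOpportunities;
    }

    public static void main(String[] args) {
        // One container requested on a 100-node cluster: threshold is 0.01,
        // so a single missed opportunity is enough.
        System.out.println(canRelaxToOffSwitch(1, 100, 1));    // true
        // A batch of 200 containers on the same cluster: threshold is 200.
        System.out.println(canRelaxToOffSwitch(200, 100, 50)); // false
    }
}
```

Under these assumptions the threshold shrinks as the cluster grows for a fixed small batch, which matches the observation that the problem is more pronounced in big clusters.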
