Date: Sun, 2 Apr 2017 19:01:41 +0000 (UTC)
From: "Arun Suresh (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.13056409.1489606658000.187052.1491159701570@Atlassian.JIRA>
In-Reply-To: <JIRA.13056409.1489606658000@Atlassian.JIRA>
References: <JIRA.13056409.1489606658000@Atlassian.JIRA> <JIRA.13056409.1489606658411@jira-lw-us.apache.org>
Subject: [jira] [Commented] (YARN-6344) Rethinking OFF_SWITCH locality in
 CapacityScheduler
archived-at: Sun, 02 Apr 2017 19:01:46 -0000


    [ https://issues.apache.org/jira/browse/YARN-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952813#comment-15952813 ] 

Arun Suresh commented on YARN-6344:
-----------------------------------

[~kkaranasos], the patch LGTM. I like the idea of adding the rack locality delay on top of the node locality delay.

Minor nit: it looks like the imports have been re-ordered. Please try to revert that, since it might lead to merge conflicts later.


> Rethinking OFF_SWITCH locality in CapacityScheduler
> ---------------------------------------------------
>
>                 Key: YARN-6344
>                 URL: https://issues.apache.org/jira/browse/YARN-6344
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Konstantinos Karanasos
>            Assignee: Konstantinos Karanasos
>         Attachments: YARN-6344.001.patch, YARN-6344.002.patch, YARN-6344.003.patch
>
>
> When relaxing locality from node to rack, the {{node-locality-delay}} parameter is used: when the scheduling opportunities for a scheduler key exceed the value of this parameter, we relax locality and try to assign the container to a node in the corresponding rack.
> On the other hand, when relaxing locality to off-switch (i.e., assigning the container anywhere in the cluster), we use a {{localityWaitFactor}}, which is computed as the number of outstanding requests for a specific scheduler key divided by the size of the cluster.
> In case of applications that request containers in big batches (e.g., traditional MR jobs), and for relatively small clusters, the localityWaitFactor does not affect relaxing locality much.
> However, in case of applications that request containers in small batches, the {{localityWaitFactor}} takes a very small value, which leads to assigning off-switch containers too soon. This situation is even more pronounced in big clusters.
> For example, if an application requests only one container per request, the locality will be relaxed after a single missed scheduling opportunity.
> The purpose of this JIRA is to rethink the way we are relaxing locality for off-switch assignments.
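
The off-switch behaviour described in the issue can be illustrated with a small, simplified model. This is only a sketch under stated assumptions, not the actual CapacityScheduler code: the class and method names are hypothetical, the factor is assumed to be capped at 1.0, and the threshold is assumed to be the number of outstanding requests multiplied by the wait factor.

```java
// Simplified model of the off-switch relaxation check described in the issue.
// NOT the real CapacityScheduler implementation; all names are hypothetical.
final class OffSwitchRelaxationSketch {

    // Wait factor: outstanding requests for a scheduler key divided by the
    // cluster size. The cap at 1.0 is an assumption made for this sketch.
    static double localityWaitFactor(int outstandingRequests, int clusterNodes) {
        if (clusterNodes <= 0) {
            throw new IllegalArgumentException("clusterNodes must be positive");
        }
        return Math.min((double) outstandingRequests / clusterNodes, 1.0);
    }

    // Assumed rule: relax to off-switch once the missed scheduling
    // opportunities exceed outstandingRequests * localityWaitFactor.
    static boolean canAssignOffSwitch(int outstandingRequests,
                                      int clusterNodes,
                                      long missedOpportunities) {
        double threshold = outstandingRequests
            * localityWaitFactor(outstandingRequests, clusterNodes);
        return missedOpportunities > threshold;
    }
}
```

Under these assumptions, one outstanding request on a 100-node cluster gives a threshold of 0.01, so canAssignOffSwitch(1, 100, 1) is already true after a single missed opportunity, while a batch of 1000 requests on the same cluster has to miss more than 1000 opportunities first.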

