Date: Thu, 16 Mar 2017 05:05:41 +0000 (UTC)
From: "Carlo Curino (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-6344) Rethinking OFF_SWITCH locality in CapacityScheduler

    [ https://issues.apache.org/jira/browse/YARN-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927509#comment-15927509 ]

Carlo Curino commented on YARN-6344:
------------------------------------

I agree with what [~kkaranasos] said. In our clusters, the localityWaitFactor (as it is today) almost never leads to reasonable behavior. For example, in a 5k-node cluster, a very large job with 10k outstanding asks only gets to wait 2 (or up to 4) scheduling opportunities before giving up on the rack and going off-switch. The change [~kkaranasos] is proposing looks reasonable (he will share the code soon).
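(Editor's note: to make the arithmetic above concrete, the following is a minimal sketch of the behavior being described, not the actual CapacityScheduler code. The class and method names are hypothetical, and the formula is a simplification of the localityWaitFactor described in this issue, i.e., outstanding asks divided by cluster size.)

{code:java}
// Illustrative only: hypothetical names, simplified model of the current
// off-switch relaxation, where the tolerated wait shrinks as the cluster grows.
public class OffSwitchDelayExample {

    /**
     * Approximate number of missed scheduling opportunities tolerated before
     * relaxing a rack-local request to OFF_SWITCH under the current scheme.
     */
    static int offSwitchWaitOpportunities(int outstandingAsks, int clusterSize) {
        double localityWaitFactor = (double) outstandingAsks / clusterSize;
        // Wait at least one missed opportunity before giving up on the rack.
        return Math.max(1, (int) Math.ceil(localityWaitFactor));
    }

    public static void main(String[] args) {
        // 5k-node cluster, 10k outstanding asks -> only ~2 opportunities
        // before going off-switch, as noted in the comment above.
        System.out.println(offSwitchWaitOpportunities(10_000, 5_000)); // 2

        // One container per request -> relaxed after a single missed
        // opportunity, as noted in the issue description below.
        System.out.println(offSwitchWaitOpportunities(1, 5_000));      // 1
    }
}
{code}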
We have been flighting it in test clusters with good results, and will be running it in prod in the coming days. I think we could probably retain the current behavior if rack-locality-delay is not specified, but in most scenarios the current behavior is equivalent to saying "we don't care about locality unless the job is many times bigger than the cluster", in which case we might as well just remove a bunch of code from the RM. Am I missing something?


> Rethinking OFF_SWITCH locality in CapacityScheduler
> ---------------------------------------------------
>
>                 Key: YARN-6344
>                 URL: https://issues.apache.org/jira/browse/YARN-6344
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Konstantinos Karanasos
>
> When relaxing locality from node to rack, the {{node-locality-delay}} parameter is used: when the scheduling opportunities for a scheduler key exceed the value of this parameter, we relax locality and try to assign the container to a node in the corresponding rack.
> On the other hand, when relaxing locality to off-switch (i.e., assigning the container anywhere in the cluster), we use a {{localityWaitFactor}}, which is computed as the number of outstanding requests for a specific scheduler key divided by the size of the cluster.
> For applications that request containers in big batches (e.g., traditional MR jobs), and for relatively small clusters, the localityWaitFactor does not affect relaxing locality much.
> However, for applications that request containers in small batches, this factor takes a very small value, which leads to assigning off-switch containers too soon. The problem is even more pronounced in big clusters.
> For example, if an application requests only one container per request, locality will be relaxed after a single missed scheduling opportunity.
> The purpose of this JIRA is to rethink the way we relax locality for off-switch assignments.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
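(Editor's note: the comment above refers to gating off-switch relaxation on a configurable rack-locality-delay, analogous to the existing node-locality-delay, instead of the cluster-size-dependent localityWaitFactor. The sketch below only illustrates that direction; the names, configuration semantics, and default values are hypothetical, and this is not the patch referenced in the comment.)

{code:java}
// Illustrative only: hypothetical names and defaults, not the YARN-6344 patch.
public class RackLocalityDelayExample {

    // Analogous to the existing node-locality-delay: relax NODE -> RACK
    // after this many missed scheduling opportunities.
    private final int nodeLocalityDelay;

    // A fixed, configurable extra delay before relaxing RACK -> OFF_SWITCH,
    // independent of cluster size and of the number of outstanding asks.
    private final int rackLocalityDelay;

    public RackLocalityDelayExample(int nodeLocalityDelay, int rackLocalityDelay) {
        this.nodeLocalityDelay = nodeLocalityDelay;
        this.rackLocalityDelay = rackLocalityDelay;
    }

    /** May this request be placed on a rack-local (non node-local) node? */
    boolean canRelaxToRack(int missedOpportunities) {
        return missedOpportunities >= nodeLocalityDelay;
    }

    /** May this request be placed anywhere in the cluster (off-switch)? */
    boolean canRelaxToOffSwitch(int missedOpportunities) {
        return missedOpportunities >= nodeLocalityDelay + rackLocalityDelay;
    }

    public static void main(String[] args) {
        RackLocalityDelayExample policy = new RackLocalityDelayExample(40, 200);
        System.out.println(policy.canRelaxToRack(10));       // false: keep waiting for node-local
        System.out.println(policy.canRelaxToRack(50));       // true:  rack-local allowed
        System.out.println(policy.canRelaxToOffSwitch(100)); // false: not yet off-switch
        System.out.println(policy.canRelaxToOffSwitch(300)); // true:  off-switch allowed
    }
}
{code}

(If rack-locality-delay were left unspecified, the current localityWaitFactor behavior could be kept as a fallback, as the comment above suggests.)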