Date: Tue, 27 Dec 2016 17:41:58 +0000 (UTC)
From: "Karthik Kambatla (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-6021) When the minShares of all queues add up to more than the cluster capacity, some queues can get a fair share of 0

    [ https://issues.apache.org/jira/browse/YARN-6021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780867#comment-15780867 ]

Karthik Kambatla commented on YARN-6021:
----------------------------------------

Excuse the long-winded response.

I believe minshare was originally introduced to handle a queue's *urgent* requirement on a *saturated* cluster:
# When preemption is enabled, a minshare worth of resources is preempted from other queues. This was necessary because fairshare preemption used to be very rigid. Since then, we have augmented fairshare preemption with a threshold and a timeout, giving admins more control. I would encourage trying these newer controls instead of minshare preemption.
# When preemption is not enabled, setting a minshare on a queue forcibly raises the queue's fairshare to at least that value.

Minshare makes sense only in special cases. In a cluster where most queues have a minshare set, there is no more *fairness*. Also, minshare is an absolute value and needs to be updated as the cluster grows or shrinks. For these reasons, I would discourage the use of minshare; at Cloudera, we discourage our customers from using it as well. There are exceptions, such as a high-priority, latency-sensitive workload that needs at least {{x}} resources to start.

In your example, I think either the minshares are being abused or the cluster is too small. If all the queues really need at least that many resources to be functional, the cluster clearly cannot accommodate all of them running together.

PS: If backward compatibility were not a concern, I would have advocated removing minshare altogether.
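The threshold and timeout controls mentioned above are set in fair-scheduler.xml, either as allocation-file defaults or per queue. A minimal sketch for orientation only; the queue name and the specific values are made-up examples, not recommendations:

{code:xml}
<?xml version="1.0"?>
<allocations>
  <!-- File-wide defaults: a queue may trigger preemption once it has been
       below 50% of its fair share for 60 seconds. -->
  <defaultFairSharePreemptionTimeout>60</defaultFairSharePreemptionTimeout>
  <defaultFairSharePreemptionThreshold>0.5</defaultFairSharePreemptionThreshold>

  <!-- Hypothetical queue with tighter, per-queue overrides. -->
  <queue name="latency_sensitive">
    <weight>2.0</weight>
    <fairSharePreemptionTimeout>30</fairSharePreemptionTimeout>
    <fairSharePreemptionThreshold>0.8</fairSharePreemptionThreshold>
  </queue>
</allocations>
{code}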
> When the minShares of all queues add up to more than the cluster capacity, some queues can get a fair share of 0
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6021
>                 URL: https://issues.apache.org/jira/browse/YARN-6021
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.4.0
>            Reporter: Feng Yuan
>            Assignee: Feng Yuan
>            Priority: Critical
>
> In fair-scheduler.xml, if the configured minShares add up to more than the parent queue's fair share (for root's children, that fair share is the cluster capacity), the R value looks like this while the children's fair shares are being computed:
> 1.0
> 0.5
> 0.25
> 0.125
> 0.0625
> 0.03125
> 0.015625
> 0.0078125
> 0.00390625
> I found this is due to:
> double rMax = 1.0;
> while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type)
>     < totalResource) {
>   rMax *= 2.0;
> }
> because resourceUsedWithWeightToResourceRatio adds the minShares in as well.
> Should we really bring minShare into the fair share computation at all? My suggestion is that considering only the weight is enough; minShare's guarantee will be fulfilled when assignContainer runs. Suggestions welcome!
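To make the reported behavior concrete, here is a small, self-contained Java sketch of the computation being described. It is a simplified model, not the actual ComputeFairShares code: the clamp-to-minShare share function and the queue names, weights, and minShare values are all illustrative assumptions. It reproduces the halving R sequence and shows a queue with no minShare ending up with a fair share of 0 once the configured minShares alone exceed the cluster capacity.

{code:java}
import java.util.Arrays;
import java.util.List;

/** Simplified model of the fair-share computation (NOT the real ComputeFairShares). */
public class FairShareSketch {

    /** A queue with a weight and a minShare; maxShare is omitted for brevity. */
    record Queue(String name, double weight, double minShare) {}

    /** Assumed share function: weight-proportional, floored at minShare. */
    static double share(Queue q, double r) {
        return Math.max(q.weight() * r, q.minShare());
    }

    /** Stand-in for resourceUsedWithWeightToResourceRatio: resources used at ratio r. */
    static double usedAtRatio(List<Queue> queues, double r) {
        return queues.stream().mapToDouble(q -> share(q, r)).sum();
    }

    public static void main(String[] args) {
        double totalResource = 100.0;
        // The minShares alone (60 + 60 = 120) already exceed the cluster capacity of 100.
        List<Queue> queues = Arrays.asList(
                new Queue("a", 1.0, 60.0),
                new Queue("b", 1.0, 60.0),
                new Queue("c", 1.0, 0.0));   // no minShare configured

        // The loop quoted in the issue: grow rMax until usage covers the cluster.
        double rMax = 1.0;
        while (usedAtRatio(queues, rMax) < totalResource) {
            rMax *= 2.0;
        }
        System.out.println("R = " + rMax);

        // Binary search for r in [0, rMax]. Because the minShares alone exceed
        // totalResource, usage stays >= totalResource for every r, so the upper
        // bound keeps halving toward 0 -- the 1.0, 0.5, 0.25, ... sequence above.
        double left = 0.0, right = rMax;
        for (int i = 0; i < 25; i++) {
            double mid = (left + right) / 2.0;
            System.out.println("R = " + mid);
            if (usedAtRatio(queues, mid) < totalResource) {
                left = mid;
            } else {
                right = mid;
            }
        }

        // Queue "c" has no minShare, so its share = weight * r, which is ~0 here.
        queues.forEach(q ->
                System.out.printf("%s fairShare = %.4f%n", q.name(), share(q, right)));
    }
}
{code}

With these illustrative numbers, queues "a" and "b" are held at 60 apiece by the minShare floor while queue "c" collapses to a fair share of 0, which matches the behavior the issue describes.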