Date: Tue, 14 Feb 2017 22:34:41 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-6163) FS Preemption is a trickle for severely starved applications

    [ https://issues.apache.org/jira/browse/YARN-6163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866841#comment-15866841 ]

ASF GitHub Bot commented on YARN-6163:
--------------------------------------

Github user templedf commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/192#discussion_r101159725

    --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java ---
    @@ -1147,24 +1147,32 @@ private static boolean checkAndMarkRRVisited(
        * starvation.
        */
       List getStarvedResourceRequests() {
    +    // List of RRs we build in this method to return
         List ret = new ArrayList<>();
    +
    +    // Track visited RRs to avoid the same RR at multiple locality levels
         Map> visitedRRs= new HashMap<>();
    +    // Start with current starvation and track the pending amount
         Resource pending = getStarvation();
         for (ResourceRequest rr : appSchedulingInfo.getAllResourceRequests()) {
           if (Resources.isNone(pending)) {
    +        // Found enough RRs to match the starvation
             break;
           }
    +
    +      // See if we have already seen this RR
           if (checkAndMarkRRVisited(visitedRRs, rr)) {
             continue;
           }
    -      // Compute the number of containers of this capability that fit in the
    -      // pending amount
    +      // A RR can have multiple containers of a capability. We need to
    +      // compute the number of containers that fit in "pending".
           int ratio = (int) Math.floor(
    --- End diff --

    Given that ratio is the number of containers that fit in "pending," ratio is probably a bad name. That was a good chunk of my initial confusion.

> FS Preemption is a trickle for severely starved applications
> ------------------------------------------------------------
>
>                 Key: YARN-6163
>                 URL: https://issues.apache.org/jira/browse/YARN-6163
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>    Affects Versions: 2.9.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: yarn-6163-1.patch, yarn-6163-2.patch
>
>
> With current logic, only one RR is considered per each instance of marking an application starved. This marking happens only on the update call that runs every 500ms. Due to this, an application that is severely starved takes forever to reach fairshare based on preemptions.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
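
For readers following the naming discussion, here is a minimal, self-contained sketch of the idea in the diff: walk the resource requests, work out how many containers of each capability fit in the remaining "pending" starvation, and stop once the starvation is covered. It is not the patch itself. The classes Res and Request and the methods containersThatFit and selectStarved are hypothetical stand-ins for the YARN types, and the rule that a container fits only when both memory and vcores fit is an assumption made for illustration. The visited-RR tracking from the diff is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

public class StarvedRequestsSketch {

  /** Hypothetical stand-in for a YARN Resource: memory in MB plus vcores. */
  public static final class Res {
    public final long memoryMb;
    public final int vcores;

    public Res(long memoryMb, int vcores) {
      this.memoryMb = memoryMb;
      this.vcores = vcores;
    }

    /** True when nothing is left in either dimension. */
    public boolean isNone() {
      return memoryMb <= 0 && vcores <= 0;
    }
  }

  /** Hypothetical stand-in for a ResourceRequest: N containers of one capability. */
  public static final class Request {
    public final Res capability;
    public final int numContainers;

    public Request(Res capability, int numContainers) {
      this.capability = capability;
      this.numContainers = numContainers;
    }
  }

  /**
   * Number of whole containers of the given capability that fit in pending.
   * Assumption: a container fits only if both memory and vcores fit.
   */
  public static int containersThatFit(Res pending, Res capability) {
    long byMemory = capability.memoryMb > 0
        ? Math.max(0, pending.memoryMb) / capability.memoryMb : Long.MAX_VALUE;
    long byVcores = capability.vcores > 0
        ? Math.max(0, pending.vcores) / capability.vcores : Long.MAX_VALUE;
    return (int) Math.min(Math.min(byMemory, byVcores), Integer.MAX_VALUE);
  }

  /** Collects requests, trimmed to what fits, until the starvation is covered. */
  public static List<Request> selectStarved(Res starvation, List<Request> requests) {
    List<Request> ret = new ArrayList<>();
    long pendingMem = starvation.memoryMb;
    int pendingVcores = starvation.vcores;

    for (Request rr : requests) {
      Res pending = new Res(pendingMem, pendingVcores);
      if (pending.isNone()) {
        break; // found enough requests to match the starvation
      }
      int fit = Math.min(containersThatFit(pending, rr.capability), rr.numContainers);
      if (fit == 0) {
        continue; // not even one container of this capability fits
      }
      ret.add(new Request(rr.capability, fit));
      pendingMem = Math.max(0, pendingMem - fit * rr.capability.memoryMb);
      pendingVcores = Math.max(0, pendingVcores - fit * rr.capability.vcores);
    }
    return ret;
  }

  public static void main(String[] args) {
    List<Request> requests = new ArrayList<>();
    requests.add(new Request(new Res(2048, 1), 5));
    requests.add(new Request(new Res(1024, 1), 2));
    for (Request r : selectStarved(new Res(3072, 3), requests)) {
      System.out.println(r.numContainers + " x " + r.capability.memoryMb + " MB");
    }
  }
}
```

In this sketch the value is named for what it counts (containersThatFit), which is the point of the review comment about "ratio". Considering several requests in one pass, rather than one per update, is what lets a severely starved application close its gap faster.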