Date: Tue, 14 Feb 2017 23:41:43 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-6163) FS Preemption is a trickle for severely starved applications

    [ https://issues.apache.org/jira/browse/YARN-6163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866938#comment-15866938 ]

ASF GitHub Bot commented on YARN-6163:
--------------------------------------

Github user templedf commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/192#discussion_r101171746

    --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/VisitedResourceRequestTracker.java ---
    @@ -0,0 +1,124 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair;
    +
    +import org.apache.hadoop.yarn.api.records.Priority;
    +import org.apache.hadoop.yarn.api.records.Resource;
    +import org.apache.hadoop.yarn.api.records.ResourceRequest;
    +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker;
    +
    +import java.util.HashMap;
    +import java.util.HashSet;
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Set;
    +
    +/**
    + * Applications place {@link ResourceRequest}s at multiple levels. This is a
    + * helper class that allows tracking if a {@link ResourceRequest} has been
    + * visited at a different locality level.
    + *
    + * This is implemented for {@link FSAppAttempt#getStarvedResourceRequests()}.
    + * The implementation is not thread-safe.
    + */
    +class VisitedResourceRequestTracker {
    +  private final Map<Priority, Map<Resource, TrackerPerPriorityResource>> map =
    +      new HashMap<>();
    +  private final ClusterNodeTracker<FSSchedulerNode> nodeTracker;
    +
    +  VisitedResourceRequestTracker(
    +      ClusterNodeTracker<FSSchedulerNode> nodeTracker) {
    +    this.nodeTracker = nodeTracker;
    +  }
    +
    +  /**
    +   * Check if the {@link ResourceRequest} is visited before, and track it.
    +   * @param rr {@link ResourceRequest} to visit
    +   * @return true if rr this is the first visit across all
    +   * locality levels, false otherwise
    +   */
    +  boolean visit(ResourceRequest rr) {
    +    Priority priority = rr.getPriority();
    +    Resource capability = rr.getCapability();
    +
    +    Map<Resource, TrackerPerPriorityResource> subMap = map.get(priority);
    +    if (subMap == null) {
    +      subMap = new HashMap<>();
    +      map.put(priority, subMap);
    +    }
    +
    +    TrackerPerPriorityResource tracker = subMap.get(capability);
    +    if (tracker == null) {
    +      tracker = new TrackerPerPriorityResource();
    +      subMap.put(capability, tracker);
    +    }
    +
    +    return tracker.visit(rr.getResourceName());
    +  }
    +
    +  private class TrackerPerPriorityResource {
    +    private Set<String> racksWithNodesVisited = new HashSet<>();
    +    private Set<String> racksVisted = new HashSet<>();
    +    private boolean anyVisited;
    +
    +    private boolean visitAny() {
    +      if (racksVisted.isEmpty() && racksWithNodesVisited.isEmpty()) {
    +        anyVisited = true;
    +      }
    +      return anyVisited;
    +    }
    +
    +    private boolean visitRack(String rackName) {
    +      if (anyVisited || racksWithNodesVisited.contains(rackName)) {
    +        return false;
    +      } else {
    +        racksVisted.add(rackName);
    +        return true;
    +      }
    +    }
    +
    +    private boolean visitNode(String rackName) {
    +      if (anyVisited || racksVisted.contains(rackName)) {
    +        return false;
    +      } else {
    +        racksWithNodesVisited.add(rackName);
    +        return true;
    +      }
    +    }
    +
    +    private boolean visit(String resourceName) {
    +      if (resourceName.equals(ResourceRequest.ANY)) {
    +        return visitAny();
    +      }
    +
    +      List<FSSchedulerNode> nodes =
    +          nodeTracker.getNodesByResourceName(resourceName);
    +      switch (nodes.size()) {
    +        case 0:
    +          // Log error
    +          return false;
    +        case 1:
    +          // Node
    +          return visitNode(nodes.remove(0).getRackName());
    --- End diff --

    Seems like you could use a second check here to make sure that this isn't actually a 1-node rack, e.g. make sure the rack name isn't the resource name.  What's worse, marking an unvisited request as visited or vice versa?

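    For illustration only, the second check suggested above might look roughly like the sketch below. It is not code from the pull request: SketchNode, LocalityLevelSketch and classify are hypothetical names standing in for the scheduler's node type and for the switch in the diff, and it assumes that a resource name matching more than one node denotes a rack.

```java
import java.util.List;

/** Hypothetical stand-in for a scheduler node; not the real Hadoop node class. */
final class SketchNode {
  private final String nodeName;
  private final String rackName;

  SketchNode(String nodeName, String rackName) {
    this.nodeName = nodeName;
    this.rackName = rackName;
  }

  String getNodeName() {
    return nodeName;
  }

  String getRackName() {
    return rackName;
  }
}

/** Hypothetical helper showing the extra single-node-rack check. */
final class LocalityLevelSketch {
  enum Level { UNKNOWN, NODE, RACK }

  /**
   * Classify a resource name from the nodes it resolved to.
   * Assumption: more than one matching node means the name is a rack.
   */
  static Level classify(String resourceName, List<SketchNode> nodes) {
    if (nodes.isEmpty()) {
      // Nothing matched; the caller would log an error here.
      return Level.UNKNOWN;
    }
    if (nodes.size() > 1) {
      return Level.RACK;
    }
    // Exactly one match: usually a node, but a rack that holds a single
    // node also resolves to one entry. Comparing the resource name with
    // that node's rack name tells the two cases apart.
    SketchNode only = nodes.get(0);
    return resourceName.equals(only.getRackName()) ? Level.RACK : Level.NODE;
  }
}
```

    With a check along these lines, a request naming a 1-node rack would be routed to the rack-level tracking rather than the node-level tracking. Which of the two misclassifications is worse is the open question raised in the comment.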
> FS Preemption is a trickle for severely starved applications
> ------------------------------------------------------------
>
>                 Key: YARN-6163
>                 URL: https://issues.apache.org/jira/browse/YARN-6163
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>    Affects Versions: 2.9.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: yarn-6163-1.patch, yarn-6163-2.patch
>
>
> With current logic, only one RR is considered per each instance of marking an application starved. This marking happens only on the update call that runs every 500ms. Due to this, an application that is severely starved takes forever to reach fairshare based on preemptions.