Date: Mon, 10 Jul 2017 17:40:00 +0000 (UTC)
From: "Yufei Gu (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-6793) Duplicated reservation in Fair Scheduler preemption

[ https://issues.apache.org/jira/browse/YARN-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yufei Gu updated YARN-6793:
---------------------------
Description:
There is a delay between when preemption happens and when the preempted containers are actually killed. If the resources released from the nodes marked for preemption are, at that moment, still not enough for the pending resource request, the scheduler reserves that node again.

E.g. the scheduler makes a reservation on node 1 for app 1. By default it takes 15 s to kill the containers on node 1 needed to fulfill that resource request. If some resource was released from node 1 before the killing completes, the scheduler reserves node 1 for app 1 again.

was: There is a delay between preemption happen and containers are killed. If some resources released from nodes which are supposed to be preempted at that time are not enough for the resource request, reservation happens again at that node. E.g. scheduler reserves in node 1 for app 1. It will take 15s by default to kill containers in node 1 for fulfill that resource requests. If released from node 1 before the killing, scheduler reserve for app1 again in node 1.
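The race described above can be sketched in a minimal, hypothetical way. This is not actual FairScheduler code; the class, the `trySchedule` method, and the string node/app identifiers are all invented for illustration. It shows why a guard is needed: while the preemption kill delay is pending, a node heartbeat that still cannot satisfy the request would otherwise reserve the same node for the same app a second time.

```java
// Hypothetical, simplified model of the YARN-6793 race -- not FairScheduler code.
import java.util.HashMap;
import java.util.Map;

public class ReservationSketch {
    // node -> app that currently holds a reservation on that node
    static final Map<String, String> reservations = new HashMap<>();

    // Simulates one scheduling attempt on a node heartbeat.
    // Returns "allocated", "reserved", or "skipped".
    static String trySchedule(String node, String app,
                              int freeResource, int requested) {
        if (freeResource >= requested) {
            reservations.remove(node);      // reservation fulfilled, allocate
            return "allocated";
        }
        // Guard against the duplicated reservation: if this app already holds
        // a reservation on this node (e.g. while waiting out the ~15s
        // preemption kill delay), do not reserve it again.
        if (app.equals(reservations.get(node))) {
            return "skipped";
        }
        reservations.put(node, app);
        return "reserved";
    }

    public static void main(String[] args) {
        // Heartbeat 1: not enough free resource -> reserve node1 for app1.
        System.out.println(trySchedule("node1", "app1", 2, 8)); // reserved
        // A small container is released before preemption kills anything;
        // still insufficient -> without the guard this would reserve again.
        System.out.println(trySchedule("node1", "app1", 4, 8)); // skipped
        // Preempted containers finally killed -> enough resource -> allocate.
        System.out.println(trySchedule("node1", "app1", 8, 8)); // allocated
    }
}
```

Without the equality check in the middle, the second heartbeat would re-reserve node 1 for app 1, which is the duplicated reservation this issue reports.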
> Duplicated reservation in Fair Scheduler preemption
> ----------------------------------------------------
>
>                 Key: YARN-6793
>                 URL: https://issues.apache.org/jira/browse/YARN-6793
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.8.1, 3.0.0-alpha3
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>            Priority: Critical
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)