From: Ilya Pronin
Date: Fri, 1 Sep 2017 14:35:12 +0100
Subject: Re: recovery_agent_removal_limit usage question?
To: dev@mesos.apache.org

Hey,

I'm not sure I understood your question correctly. But AFAIK the
recovery_agent_removal_limit flag is intended to limit the number of agents
that will be marked unreachable after the re-registration timeout. If the
master sees that it has to remove more agents than the limit allows, it will
fail over. Otherwise, agents that have not yet re-registered will be marked
unreachable at the rate set by slave_removal_rate_limit. Here's the code that
does that:
https://github.com/apache/mesos/blob/master/src/master/master.cpp#L1946

We no longer shut down agents if they try to re-register after being marked
unreachable, so we can safely remove those agents from the registry. However,
the limit might still be a good signal for the operator to investigate why a
lot of agents did not re-register.

On Fri, Sep 1, 2017 at 6:46 AM, tommy xiao wrote:

> Today I was curious to read the Mesos source code for
> --recovery_agent_removal_limit. How does it work in the source code? I
> have not found any useful logic for recovery_agent_removal_limit. Can
> anyone do me a favor?
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com

--
Ilya Pronin
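
To make the decision described in the reply above concrete, here is a rough Python model of it. This is only a sketch, not the Mesos implementation (which is C++ in master.cpp): the function names are hypothetical, and it assumes the limit is given as a percentage string such as "50%" and compared against the fraction of registered agents that have not re-registered. Rate limiting of the unreachable marking is not modelled.

```python
def parse_percentage(value):
    """Turn a string such as "50%" into the fraction 0.5 (assumed flag format)."""
    return float(value.rstrip("%")) / 100.0


def agents_after_timeout(registered, reregistered, removal_limit="100%"):
    """Model the master's choice once the re-registration timeout fires.

    registered:    agents listed in the registry before the master failover
    reregistered:  agents that re-registered before the timeout
    removal_limit: hypothetical stand-in for recovery_agent_removal_limit

    Returns ("failover", []) when too many agents are missing, otherwise
    ("mark_unreachable", missing_agents).
    """
    missing = [agent for agent in registered if agent not in reregistered]
    if not registered:
        # Nothing was registered, so there is nothing to remove.
        return ("mark_unreachable", [])

    missing_fraction = len(missing) / len(registered)
    if missing_fraction > parse_percentage(removal_limit):
        # Safety net: refuse to remove this many agents and give up leadership.
        return ("failover", [])

    # In the real master these would be marked unreachable gradually,
    # at the configured removal rate limit.
    return ("mark_unreachable", missing)
```

With four registered agents and a 50% limit, one missing agent would be marked unreachable, while three missing agents would make this model choose failover instead.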