From: Benjamin Mahler
Date: Mon, 7 Mar 2016 18:25:50 -0800
Subject: Re: State of registrar
To: dev

Apologies for the long delay.

I wouldn't call it experimental (that comment is stale); you should feel free to turn on strictness.

Strictness enforces that agents that were removed by an old master cannot re-join with a new master. This preserves the steady state behavior: if the master removes an agent, it does not allow it to return.

Ideally, the flag would be removed and strictness made the default, but we didn't feel comfortable removing it until we had state backup support in the master. Turning off strictness provides an escape hatch if state is lost. However, now that we are persisting more information than just the list of agents, this escape hatch doesn't restore the other state (like maintenance schedules, quota information, etc).

As for why it's not on by default today, we found that many frameworks, like Aurora and Marathon, are capable of handling a removed agent re-surfacing in the cluster, so it wasn't critical to turn this on. We also realized that we need to re-work the partition handling in Mesos in order to give frameworks control over how to react to an unreachable agent.

Does that clarify things?

On Mon, Feb 1, 2016 at 11:04 AM, Zhitao Li wrote:

> Hi,
>
> I've been reading related documentation on Mesos website and trying to
> understand the current status of registrar.
>
> I noticed that we still consider "--registrar_strict" as experimental, but
> I can't find the back story of what's needed to finish the project or the
> JIRA so tracking that.
>
> Also, does anyone have recommendations on whether we should turn this flag
> on, and what benefits cluster operator would get?
>
> Thanks.
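
For readers who want the strictness rule from the reply in concrete form, here is a minimal toy model. It is only a sketch and not Mesos code: `ToyRegistry`, `failover`, `may_reregister`, and `removed_agent_can_return` are hypothetical names invented for illustration, and the non-strict branch simply assumes that an agent unknown to the registry is let back in, which is one reading of the "escape hatch" described above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegistry:
    """Hypothetical stand-in for the master's persisted agent list (not Mesos code)."""

    strict: bool
    admitted: set = field(default_factory=set)

    def admit(self, agent_id: str) -> None:
        # Record that the master has admitted this agent.
        self.admitted.add(agent_id)

    def remove(self, agent_id: str) -> None:
        # Record that the master has removed this agent.
        self.admitted.discard(agent_id)

    def failover(self) -> "ToyRegistry":
        # A new master recovers the same persisted agent list (assumed here).
        return ToyRegistry(strict=self.strict, admitted=set(self.admitted))

    def may_reregister(self, agent_id: str) -> bool:
        # Agents still in the registry may always re-register.
        if agent_id in self.admitted:
            return True
        # Strict: a removed (unknown) agent is refused.
        # Non-strict: it is let back in (assumed escape-hatch behavior).
        return not self.strict


def removed_agent_can_return(strict: bool) -> bool:
    """Remove an agent under the old master, fail over, then try to re-join."""
    old_master = ToyRegistry(strict=strict)
    old_master.admit("agent-1")
    old_master.remove("agent-1")
    new_master = old_master.failover()
    return new_master.may_reregister("agent-1")
```

In this sketch, `removed_agent_can_return(True)` models the strict case, where the removed agent stays out after failover, while `removed_agent_can_return(False)` models the non-strict case, where it is allowed back. The model covers only the agent list, which mirrors the caveat above: the escape hatch does nothing for other persisted state such as maintenance schedules or quota.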