From dev-return-68620-archive-asf-public=cust-asf.ponee.io@zookeeper.apache.org Thu Apr 5 21:57:24 2018
Mailing-List: contact dev-help@zookeeper.apache.org; run by ezmlm
Reply-To: dev@zookeeper.apache.org
MIME-Version: 1.0
From: Alexander Shraer
Date: Thu, 05 Apr 2018 19:57:08 +0000
Subject: Re: [VOTE] Apache ZooKeeper release 3.4.12 candidate 1
To: dev@zookeeper.apache.org
Content-Type: multipart/alternative; boundary="f403045c1a6ada846c05691f5826"

--f403045c1a6ada846c05691f5826
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

Btw we actually observed the described issue (data loss), thankfully in a
test environment, so I thought it was important to share with the community.

Unfortunately I don’t have time to run a new ZK release for this, so I’m
not going to -1 your candidate, but we are actively working on a fix (i.e.
a test at this point) and I can commit that as soon as we have it. It may
be worthwhile to delay the release by a few more days, but it’s totally up
to you since you’re running it.

Cheers
Alex

On Thu, Apr 5, 2018 at 12:47 PM Andor Molnar wrote:

> Got that. I still believe it's a completely valid issue which has to be
> addressed, but it's not a showstopper. I'm afraid we're not going to
> convince each other, so it's probably Abe's call if he wants to create
> another release candidate for the fix.
>
> I reviewed the code on github and I think it just needs to be covered
> with a unit test to be complete.
>
> Regards,
> Andor
>
> On Thu, Apr 5, 2018 at 9:05 PM, Alexander Shraer wrote:
>
> > Yes sort of, FLE is finished, then enough observers' messages reach the
> > leader before participants' messages do.
> > Whether it's rare depends on the number of observers and participants.
> > For example with very few participants and many observers your chances
> > of hitting this are quite high.
> >
> > Alex
> >
> > On Thu, Apr 5, 2018 at 11:44 AM, Andor Molnar wrote:
> >
> > > Maybe I'm missing something here, but this looks like a rare edge
> > > case to me. Participants must finish the leader election successfully
> > > and right after that enough followers should fail to send the epoch
> > > to the leader, so observers can take it over.
> > >
> > > Is that description accurate?
> > >
> > > Andor
> > >
> > > On Thu, Apr 5, 2018 at 7:35 PM, Alexander Shraer wrote:
> > >
> > > > To clarify - in a deployment with observers this bug can
> > > > potentially cause data loss.
> > > > A server could be elected leader based just on the support of
> > > > observers, even if this server's data is stale wrt other followers.
> > > >
> > > > It is certainly a blocker, just not sure if for 3.4.11 or 3.4.12.
> > > >
> > > > Alex
> > > >
> > > > On Thu, Apr 5, 2018 at 10:29 AM Andor Molnar wrote:
> > > >
> > > > > I don't think it's a blocker.
> > > > > The jira and PR have been open since last December and 3.4.11
> > > > > was released without it.
> > > > >
> > > > > Although this bug is also important to fix, I believe it's more
> > > > > important to release a fix for the regression we've found in
> > > > > 3.4.11 asap.
> > > > >
> > > > > Abe, any thoughts?
> > > > >
> > > > > Regards,
> > > > > Andor
> > > > >
> > > > > On Thu, Apr 5, 2018 at 7:00 PM, Alexander Shraer
> > > > > <shralex@gmail.com> wrote:
> > > > >
> > > > > > Sorry for coming in at the last moment. I'm not sure when the
> > > > > > next 3.4 release is scheduled, so just wanted to mention this
> > > > > > bug, which I believe is a blocker for either this or the next
> > > > > > release:
> > > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> > > > > >
> > > > > > Best,
> > > > > > Alex
> > > > > >
> > > > > > On Thu, Apr 5, 2018 at 9:09 AM, Ted Yu wrote:
> > > > > >
> > > > > > > Can the vote be closed?
> > > > > > >
> > > > > > > It seems we have enough +1's
> > > > > > >
> > > > > > > Thanks

--f403045c1a6ada846c05691f5826--