Date: Sat, 5 Aug 2017 15:50:02 +0000 (UTC)
From: "Alexander Shraer (JIRA)"
To: dev@zookeeper.apache.org
Reply-To: dev@zookeeper.apache.org
Subject: [jira] [Commented] (ZOOKEEPER-2865) Reconfig Causes Inconsistent Configuration file among the nodes

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115434#comment-16115434 ]

Alexander Shraer commented on ZOOKEEPER-2865:
---------------------------------------------

This sounds equivalent to a situation where Server 3 starts later with an initial config that makes it impossible for it to reach any of the servers. We don't check that all servers have connected or synced before allowing the reconfig to proceed - only that a quorum is up and connected. So yes, restarting the server with a config allowing it to reach at least one other server may be required, but I don't think that this is a bug.

> Reconfig Causes Inconsistent Configuration file among the nodes
> ---------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2865
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2865
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection, quorum, server
>    Affects Versions: 3.5.3
>            Reporter: Jeffrey F. Lukman
>         Attachments: ZK-2865.pdf
>
>
> When we run our Distributed system Model Checking (DMCK) on ZooKeeper v3.5.3
> by following the workload in ZK-2778:
> - initially start 2 ZooKeeper nodes
> - start 3 new nodes
> - do a reconfiguration (the complete reconfiguration is attached in the document)
> We think our DMCK found the following bug:
> - while one of the newly joined nodes (call it node X) has not yet received
> the latest configuration update, the initial leader node closed its port,
> causing node X to be isolated.
> For complete details of the bug, please see the attached document.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
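
The point in the comment above, that the reconfig only requires a quorum to be up and connected rather than every server, can be illustrated with a small sketch. This is not ZooKeeper's actual code: the function names, the server ids, and the use of a simple majority as the quorum rule are all assumptions made for illustration.

```python
def has_connected_quorum(voting_members, connected):
    """Return True if a strict majority of voting_members is connected.

    Hypothetical helper: models a simple-majority quorum only.
    """
    members = set(voting_members)
    reachable = members & set(connected)
    return len(reachable) > len(members) // 2


def unsynced_members(voting_members, connected):
    """Return the members that the quorum check does not wait for."""
    return sorted(set(voting_members) - set(connected))


# Assumed scenario: five servers, server 3 cannot reach anyone.
members = [1, 2, 3, 4, 5]
connected = [1, 2, 4, 5]

reconfig_allowed = has_connected_quorum(members, connected)
left_behind = unsynced_members(members, connected)
print(reconfig_allowed, left_behind)  # prints: True [3]
```

In this sketch the reconfig is allowed to proceed even though server 3 has never seen the new configuration, which matches the behavior described in the comment: such a server may need to be restarted with a config that lets it reach at least one other server.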