Mailing-List: contact dev-help@zookeeper.apache.org; run by ezmlm
Reply-To: dev@zookeeper.apache.org
Date: Thu, 14 Dec 2017 16:45:00 +0000 (UTC)
From: "Abraham Fine (JIRA)"
To: dev@zookeeper.apache.org
Subject: [jira] [Updated] (ZOOKEEPER-2953) Flaky Test: testNoLogBeforeLeaderEstablishment

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abraham Fine updated ZOOKEEPER-2953:
------------------------------------
    Description:
testNoLogBeforeLeaderEstablishment has been flaky on 3.4, 3.5, and master for quite a while. My understanding is that the purpose of the test is to make sure that a server receives support from the quorum before changing the epoch and acting as leader.
There are a couple of issues with the test in its current state. First, the assertions the test makes are not always true. It is possible, if the ZooKeeper database is not cleared, for a follower to be ahead of a leader when the quorum is shut down. That follower will then likely become leader when the quorum is restarted. This is the cause of the flaky behavior. Second, the test does not appear to create the conditions it wants to test for. Since ZOOKEEPER-335 (specifically the ZOOKEEPER-1081 subtask), we take the epoch into consideration in {{FastLeaderElection}}, so the test no longer "believes it is the leader once it recovers".
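The two points above can be illustrated with a minimal Java sketch of an epoch-aware vote ordering. This is not the ZooKeeper source: the class and method names ({{VoteOrderSketch}}, {{supersedes}}) are hypothetical, and the comparison order (epoch first, then zxid, then server id) is an assumption about how {{FastLeaderElection}} ranks votes after ZOOKEEPER-1081.

```java
// Hypothetical sketch of an epoch-aware vote ordering; names are illustrative
// and this is not copied from FastLeaderElection.
final class VoteOrderSketch {

    /**
     * Returns true if the proposed vote (newEpoch, newZxid, newId) should
     * replace the vote currently held (curEpoch, curZxid, curId).
     * Assumed ordering: higher epoch wins; on a tie, higher zxid wins;
     * on a further tie, higher server id wins.
     */
    static boolean supersedes(long newEpoch, long newZxid, long newId,
                              long curEpoch, long curZxid, long curId) {
        if (newEpoch != curEpoch) {
            return newEpoch > curEpoch;
        }
        if (newZxid != curZxid) {
            return newZxid > curZxid;
        }
        return newId > curId;
    }

    public static void main(String[] args) {
        // Same epoch: a former follower (id 2) that logged one more
        // transaction than the former leader (id 3) wins on restart.
        System.out.println(supersedes(1L, 0x100000005L, 2L,
                                      1L, 0x100000004L, 3L)); // prints true

        // A vote from an older epoch loses even if its zxid and id are larger.
        System.out.println(supersedes(1L, 0x100000009L, 3L,
                                      2L, 0x200000001L, 1L)); // prints false
    }
}
```

Under this assumed ordering, the first call shows why the old assertions can fail when the database is not cleared, and the second shows why a server holding a stale epoch no longer considers itself leader on recovery.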
After discussing the issue offline with [~phunt], we decided it would still be valuable to test the situation where a server is elected leader without the support of the quorum. So I removed {{testNoLogBeforeLeaderEstablishment}} and created a new test called {{testElectionFraud}}.

  was:
testNoLogBeforeLeaderEstablishment has been flaky on 3.4, 3.5, and master for quite a while. My understanding is that the purpose of the test is to make sure that a server receives support from the quorum before changing the epoch and acting as leader.
There are a couple of issues with the test in its current state. First, the assertions the test makes are not always true. It is possible, if the ZooKeeper database is not cleared, for a follower to be ahead of a leader when the quorum is shut down. That follower will then likely become leader when the quorum is restarted. This is the cause of the flaky behavior. Second, the test does not appear to create the conditions it wants to test for. Since ZOOKEEPER-335 (specifically the ZOOKEEPER-1081 subtask), we take the epoch into consideration in {{FastLeaderElection}}, so the test no longer "believes it is the leader once it recovers".
After discussing the issue offline with [~phunt], we decided it would still be valuable to test the situation where a server is elected leader without the support of the quorum. So I removed

> Flaky Test: testNoLogBeforeLeaderEstablishment
> ----------------------------------------------
>
>                 Key: ZOOKEEPER-2953
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2953
>             Project: ZooKeeper
>          Issue Type: Bug
>   Affects Versions: 3.5.3, 3.4.11, 3.6.0
>            Reporter: Abraham Fine
>            Assignee: Abraham Fine
>
> testNoLogBeforeLeaderEstablishment has been flaky on 3.4, 3.5, and master for quite a while. My understanding is that the purpose of the test is to make sure that a server receives support from the quorum before changing the epoch and acting as leader.
> There are a couple of issues with the test in its current state.
> First, the assertions the test makes are not always true. It is possible, if the ZooKeeper database is not cleared, for a follower to be ahead of a leader when the quorum is shut down. That follower will then likely become leader when the quorum is restarted. This is the cause of the flaky behavior. Second, the test does not appear to create the conditions it wants to test for. Since ZOOKEEPER-335 (specifically the ZOOKEEPER-1081 subtask), we take the epoch into consideration in {{FastLeaderElection}}, so the test no longer "believes it is the leader once it recovers".
> After discussing the issue offline with [~phunt], we decided it would still be valuable to test the situation where a server is elected leader without the support of the quorum. So I removed {{testNoLogBeforeLeaderEstablishment}} and created a new test called {{testElectionFraud}}.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)