zookeeper-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2953) Flaky Test: testNoLogBeforeLeaderEstablishment
Date Sat, 16 Dec 2017 00:32:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293475#comment-16293475 ]

ASF GitHub Bot commented on ZOOKEEPER-2953:

Github user phunt commented on a diff in the pull request:

    --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java ---
    @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception {
                     output[0], 2);
    +    /**
    +     * This test validates that if a quorum member determines that it is leader without the support of the rest of the
    +     * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower.
    +     *
    +     * @throws IOException
    +     * @throws InterruptedException
    +     */
    +    @Test
    +    public void testElectionFraud() throws IOException, InterruptedException {
    +        // capture QuorumPeer logging
    +        Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout();
    +        ByteArrayOutputStream os = new ByteArrayOutputStream();
    +        WriterAppender appender = new WriterAppender(layout, os);
    +        appender.setThreshold(Level.INFO);
    +        Logger qlogger = Logger.getLogger(QuorumPeer.class);
    +        qlogger.addAppender(appender);
    +        numServers = 3;
    +        // used for assertions later
    +        boolean foundLeading = false;
    +        boolean foundLooking = false;
    +        boolean foundFollowing = false;
    +        try {
    +          // spin up a quorum, we use a small ticktime to make the test run faster
    +          servers = LaunchServers(numServers, 500);
    --- End diff --
    Note that by reducing the tick time you are also shrinking the init and sync limits in the same proportion, since both are counted in ticks. Not a reason not to do it, but FYI in case we start seeing this test as flaky down the road. :-)
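
    To make the proportionality concrete, here is a minimal sketch of how a limit counted in ticks turns into a wall-clock window. The class name, the method name, and the values initLimit=10, syncLimit=5 and the 2000 ms baseline tick are illustrative assumptions, not taken from the patch or the test harness.

```java
// Illustrative sketch only: TickLimitSketch and all numbers below are
// assumptions for the example, not values taken from the pull request.
final class TickLimitSketch {

    /** Wall-clock window in milliseconds for a limit expressed in ticks. */
    static long effectiveTimeoutMs(int tickTimeMs, int limitInTicks) {
        return (long) tickTimeMs * limitInTicks;
    }

    public static void main(String[] args) {
        int initLimit = 10; // assumed value, in ticks
        int syncLimit = 5;  // assumed value, in ticks

        // Compare an assumed baseline tick with the 500 ms tick used in the diff.
        for (int tickTimeMs : new int[] {2000, 500}) {
            System.out.printf(
                "tickTime=%d ms -> init window=%d ms, sync window=%d ms%n",
                tickTimeMs,
                effectiveTimeoutMs(tickTimeMs, initLimit),
                effectiveTimeoutMs(tickTimeMs, syncLimit));
        }
    }
}
```

    Under these assumed values, cutting the tick to a quarter also cuts both windows to a quarter, so a slow build machine has less time to connect and sync before a peer gives up.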

> Flaky Test: testNoLogBeforeLeaderEstablishment
> ----------------------------------------------
>                 Key: ZOOKEEPER-2953
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2953
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.5.3, 3.4.11, 3.6.0
>            Reporter: Abraham Fine
>            Assignee: Abraham Fine
> testNoLogBeforeLeaderEstablishment has been flaky on 3.4, 3.5, and master for quite a while. My understanding is that the purpose of the test is to make sure that a server receives support from the quorum before changing the epoch and acting as leader.
> There are a couple of issues with the test in its current state. First, the assertions the test makes are not always true. It is possible, if the ZooKeeper database is not cleared, for a follower to be ahead of a leader when the quorum is shut down. That follower will then likely become leader when the quorum is restarted. This is the cause of the flaky behavior. Second, the test does not appear to create the conditions it wants to test for. Since ZOOKEEPER-335 (specifically the ZOOKEEPER-1081 subtask) we take the epoch into consideration in {{FastLeaderElection}}, so the server under test no longer "believes it is the leader once it recovers".
> After discussing the issue offline with [~phunt] we decided it would still be valuable to test the situation where a server is elected leader without the support of the quorum. So I removed {{testNoLogBeforeLeaderEstablishment}} and created a new test called {{testElectionFraud}}.
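
For readers unfamiliar with what "taking the epoch into consideration" can mean in an election, the following is a simplified sketch of a vote comparison that ranks by epoch first, then by zxid, then by server id. It is an illustration under assumptions, not the actual {{FastLeaderElection}} code: the class and method names are hypothetical, and details such as quorum weights are left out.

```java
// Simplified illustration: VoteOrderSketch and supersedes() are hypothetical
// names, and this is not the real FastLeaderElection implementation.
final class VoteOrderSketch {

    /**
     * Returns true if the proposed vote should replace the current one.
     * Votes are ranked by epoch, then by zxid, then by server id.
     */
    static boolean supersedes(long newId, long newZxid, long newEpoch,
                              long curId, long curZxid, long curEpoch) {
        if (newEpoch != curEpoch) {
            return newEpoch > curEpoch; // a newer epoch always wins
        }
        if (newZxid != curZxid) {
            return newZxid > curZxid;   // same epoch: more recent history wins
        }
        return newId > curId;           // full tie: higher server id wins
    }
}
```

With an ordering like this, a server whose vote carries an older epoch cannot outrank the rest of the quorum merely by holding a larger zxid, and a follower that is ahead on zxid within the same epoch will outrank the previous leader, which matches the restart behavior described above.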

This message was sent by Atlassian JIRA