Return-Path: X-Original-To: apmail-zookeeper-user-archive@www.apache.org Delivered-To: apmail-zookeeper-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id DF767174A0 for ; Fri, 1 May 2015 15:53:32 +0000 (UTC) Received: (qmail 86246 invoked by uid 500); 1 May 2015 15:53:32 -0000 Delivered-To: apmail-zookeeper-user-archive@zookeeper.apache.org Received: (qmail 86195 invoked by uid 500); 1 May 2015 15:53:32 -0000 Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@zookeeper.apache.org Delivered-To: mailing list user@zookeeper.apache.org Received: (qmail 86184 invoked by uid 99); 1 May 2015 15:53:32 -0000 Received: from mail-relay.apache.org (HELO mail-relay.apache.org) (140.211.11.15) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 01 May 2015 15:53:32 +0000 Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com [209.85.217.182]) by mail-relay.apache.org (ASF Mail Server at mail-relay.apache.org) with ESMTPSA id EEE351A0449 for ; Fri, 1 May 2015 15:53:30 +0000 (UTC) Received: by lbbuc2 with SMTP id uc2so67123577lbb.2 for ; Fri, 01 May 2015 08:53:29 -0700 (PDT) X-Gm-Message-State: ALoCoQk2a45Xee5TZ+FEp1CBSP7iEi3wCCl2eD6F0ZAc6vqel7FbtLpiLHquJ6YxIlbXmPeAYIzc MIME-Version: 1.0 X-Received: by 10.152.26.230 with SMTP id o6mr5108271lag.7.1430495609315; Fri, 01 May 2015 08:53:29 -0700 (PDT) Received: by 10.112.134.231 with HTTP; Fri, 1 May 2015 08:53:29 -0700 (PDT) In-Reply-To: <5AC0C4FA4C08FE4798466CD7A2145B1016B5E922@gsdgeup01env1.firmwide.corp.gs.com> References: <1333241735.772264.1430329597990.JavaMail.yahoo@mail.yahoo.com> <0237AE9E-8B51-401B-A44F-061BD7F64C38@gmail.com> <3998DDDB-5153-4007-B8C6-1666D4E39D40@yahoo.com> <5AC0C4FA4C08FE4798466CD7A2145B1016B5E922@gsdgeup01env1.firmwide.corp.gs.com> Date: Fri, 1 May 2015 11:53:29 -0400 Message-ID: Subject: Re: Leader election duration From: Camille Fournier To: "bookkeeper-user@zookeeper.apache.org" Content-Type: multipart/alternative; boundary=089e0160a330a57b4805150736f7 --089e0160a330a57b4805150736f7 Content-Type: text/plain; charset=UTF-8 One thing that jumps out at me here is that a lot of these messages are from different rounds. Some say they are in round 6, and some say they are in round 7. 2015-04-25 12:02:20,585 - INFO [WorkerReceiver[myid=2]: FastLeaderElection@542] - Notification: 5 (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LEADING (n.state), 5 (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) 2015-04-25 12:02:23,723 - INFO [WorkerReceiver[myid=2]: FastLeaderElection@542] - Notification: 1 (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) So that's a bit odd. On Fri, May 1, 2015 at 10:26 AM, Dudzinski, Karol wrote: > Unfortunately, I can't reproduce this other than in one production cluster > so testing a patch is difficult. > > Below are the relevant logs that Camille requested. > > For the first issue, the leader selection taking a long time, this is a > typical example: > 2015-04-25 12:01:59,734 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :QuorumPeer@670] - LOOKING > 2015-04-25 12:01:59,737 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :FileSnap@83] - Reading snapshot > 2015-04-25 12:02:05,203 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 4 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:05,327 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :FastLeaderElection@740] - New election. My id = 2, proposed > zxid=0x41d000c7d27 > 2015-04-25 12:02:05,327 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LOOKING (n.state), 5 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:05,328 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 2 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:05,530 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :FastLeaderElection@774] - Notification time out: 400 > 2015-04-25 12:02:05,940 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :FastLeaderElection@774] - Notification time out: 800 > 2015-04-25 12:02:06,750 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :FastLeaderElection@774] - Notification time out: 1600 > 2015-04-25 12:02:08,359 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :FastLeaderElection@774] - Notification time out: 3200 > 2015-04-25 12:02:08,709 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 4 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:08,756 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LOOKING (n.state), 5 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:10,320 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 2 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:10,325 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), FOLLOWING (n.state), 4 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:13,539 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :FastLeaderElection@774] - Notification time out: 6400 > 2015-04-25 12:02:13,707 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:13,755 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LEADING (n.state), 5 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:15,329 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 2 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:15,334 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), FOLLOWING (n.state), 4 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:15,582 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LEADING (n.state), 5 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:18,709 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 1 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:20,339 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 2 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:20,350 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), FOLLOWING (n.state), 4 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:20,585 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LEADING (n.state), 5 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:23,723 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 1 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:25,349 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 2 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:25,357 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), FOLLOWING (n.state), 4 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:25,598 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe6 (n.round), LEADING (n.state), 5 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:28,729 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:30,358 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 2 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:31,500 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 4 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 4 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:33,726 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:34,476 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 5 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:35,357 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 4 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 2 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:36,506 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 4 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 4 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:38,726 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:39,473 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 5 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:40,357 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 2 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:41,507 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 4 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 4 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:43,726 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:44,472 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 5 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:46,504 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 4 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 4 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:48,725 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 2 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41d (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:49,480 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 5 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:51,504 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 4 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 4 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:53,724 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 1 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:54,488 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 5 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:56,514 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 5 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 4 > (n.sid), 0x41e (n.peerEPoch), LOOKING (my state) > 2015-04-25 12:02:56,734 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :QuorumPeer@738] - FOLLOWING > 2015-04-25 12:02:56,734 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :ZooKeeperServer@162] - Created server with tickTime 2000 > minSessionTimeout 4000 maxSessionTimeout 100000 datadir snapdir > > 2015-04-25 12:02:56,734 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 56999 > > These are logs from server with id 2, they are following the reboot of the > leader which was sid 3. > > > For the second problem, where the sync'ing doesn't happen, here are the > follower logs (immediately after the election shown above): > 2015-04-25 12:02:56,818 - WARN [QuorumPeer[myid=2]/0.0.0.0:2181 > :Learner@232] - Unexpected exception, tries=0, connecting to id 5>/:2182 > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) > at > java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) > at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) > at java.net.Socket.connect(Socket.java:529) > at > org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:224) > at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71) > at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > 2015-04-25 12:02:57,906 - WARN [QuorumPeer[myid=2]/0.0.0.0:2181 > :Learner@232] - Unexpected exception, tries=1, connecting to id 5>/:2182 > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) > at > java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) > at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) > at java.net.Socket.connect(Socket.java:529) > at > org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:224) > at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71) > at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740) > at java.lang.Thread.run(Thread.java:662) > 2015-04-25 12:02:58,725 - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 4 > (n.leader), 0x41d000c7d27 (n.zxid), 0xe7 (n.round), LOOKING (n.state), 1 > (n.sid), > 0x41e (n.peerEPoch), FOLLOWING (my state) > 2015-04-25 12:02:58,996 - WARN [QuorumPeer[myid=2]/0.0.0.0:2181 > :Learner@232] - Unexpected exception, tries=2, connecting to id 5>/:2182 > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) > at > java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) > at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) > at java.net.Socket.connect(Socket.java:529) > at > org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:224) > at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71) > at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > 2015-04-25 12:03:00,093 - WARN [QuorumPeer[myid=2]/0.0.0.0:2181 > :Learner@232] - Unexpected exception, tries=3, connecting to id 5>/:2182 > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) > at > java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) > at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) > at java.net.Socket.connect(Socket.java:529) > at > org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:224) > at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71) > at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > at java.lang.Thread.run(Thread.java:662) > 2015-04-25 12:03:01,185 - ERROR [QuorumPeer[myid=2]/0.0.0.0:2181 > :Learner@229] - Unexpected exception > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) > at > java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) > at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) > at java.net.Socket.connect(Socket.java:529) > at > org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:224) > at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71) > at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740) > 2015-04-25 12:03:01,186 - WARN [QuorumPeer[myid=2]/0.0.0.0:2181 > :Follower@89] - Exception when following the leader > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) > at > java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) > at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) > at java.net.Socket.connect(Socket.java:529) > at > org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:224) > at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71) > at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740) > 2015-04-25 12:03:01,186 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :Follower@166] - shutdown called > java.lang.Exception: shutdown Follower > at > org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166) > at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744) > 2015-04-25 12:03:01,187 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :FollowerZooKeeperServer@139] - Shutting down > 2015-04-25 12:03:01,187 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :ZooKeeperServer@419] - shutting down > 2015-04-25 12:03:01,187 - INFO [QuorumPeer[myid=2]/0.0.0.0:2181 > :QuorumPeer@670] - LOOKING > > I think they get rejected because as far as sid 5 is concerned, it's not > the leader yet as shown by these two lines: > 2015-04-25 12:02:27,247 - INFO [QuorumPeer[myid=5]/0.0.0.0:2181 > :QuorumPeer@670] - LOOKING > 2015-04-25 12:03:01,838 - INFO [QuorumPeer[myid=5]/0.0.0.0:2181 > :Leader@345] - LEADING - LEADER ELECTION TOOK - 34591 > > A short while later the leader times out: > 2015-04-25 12:03:21,850 - WARN [QuorumPeer[myid=5]/0.0.0.0:2181 > :QuorumPeer@756] - Unexpected exception > java.lang.InterruptedException: Timeout while waiting for epoch from quorum > at > org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:872) > at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:364) > at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:753) > > During another election, the leader sees this: > 2015-04-25 17:03:55,967 - INFO [QuorumPeer[myid=5]/0.0.0.0:2181 > :Leader@345] - LEADING - LEADER ELECTION TOOK - 34114 > 2015-04-25 17:04:06,034 - INFO > [LearnerHandler-/:34474:LearnerHandler@263] - Follower sid: 4 : info > : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@5c4b8582 > > Sometimes this includes all the expected followers, sometimes just one or > two. The problem doesn't end there, even for the follower for which it > prints the line above, it often never gets to the line below which I see in > successful elections (this is an example from a clean cluster I just spun > up for testing): > 2015-04-29 11:05:55,941 - INFO > [LearnerHandler-/:46737:LearnerHandler@385] - Synchronizing with > Follower sid: 1 maxCommittedLog=0x0 minCommittedLog=0x0 peerLastZxid=0x0 > > Karol > > The Goldman Sachs Group, Inc. All rights reserved. > See http://www.gs.com/disclaimer/global_email for important risk > disclosures, conflicts of interest and other terms and conditions relating > to this e-mail and your reliance on information contained in it. This > message may contain confidential or privileged information. If you are not > the intended recipient, please advise us immediately and delete this > message. See http://www.gs.com/disclaimer/email for further information > on confidentiality and the risks of non-secure electronic communication. > If you cannot access these links, please notify us by reply message and we > will send the contents to you. > > > -----Original Message----- > From: Flavio Junqueira [mailto:fpjunqueira@yahoo.com.INVALID] > Sent: 29 April 2015 22:54 > To: user@zookeeper.apache.org > Subject: Re: Leader election duration > > 170MB isn't that large. I'm thinking that the election might be taking > some time because of the issue described in ZK-2164. In that jira, the > election was failing, but I can see the blocking call causing the election > to take longer as well. You may want to test that fix and see if it makes > any difference to you. > > -Flavio > > > On 29 Apr 2015, at 19:05, Karol Dudzinski > wrote: > > > > The snapshots are a bit larger than I'd like (~170MB) but the bit that > takes time is actually before that. > > > > On a successful election, on the new leader I see the following: > > - LEADER ELECTION TOOK: 12345 > > - Synchronizing with Follower... > > - Sending DIFF or SNAP > > > > That last bit will take a long time if the snapshot is large, agreed. > However, for me it's arriving at the LEADER ELECTION TOOK bit that takes a > long time. And the bit between that and the Synchronizing often times out > with the messages I described in my previous email. > > > > Karol > > > >> On 29 Apr 2015, at 18:46, Flavio Junqueira > wrote: > >> > >> Check the size of your snapshot files. If during election you need to > transfer snapshots and they are large, then recovery is going to take > longer. We try to send a diff of the txns when possible, and sending a diff > is much faster compared to send a full snapshot. Sending a diff isn't > always possible, depending on how far behind followers compared to the > leader, so the difference might be due to snapshots and diffs. > >> -Flavio > >> > >> > >> On Wednesday, April 29, 2015 6:32 PM, Camille Fournier < > camille@apache.org> wrote: > >> > >> > >> > >> Don't suppose you could share some snippets of the log messages > >> you're seeing a lot? > >> > >> On Wed, Apr 29, 2015 at 1:29 PM, Karol Dudzinski > >> > >> wrote: > >> > >>> Did a bit more investigation today though I didn't try the prod > >>> snapshot in a test cluster. I did however find a couple of other > >>> strange things which I'd initially missed. > >>> > >>> When the leader of this particular cluster goes for a reboot, I see > >>> many notification printouts from FastLeaderElection. On some > >>> occasions I've seen nearly 50. In other clusters I usually see 5 or > >>> thereabouts. The number of these notifications seems to correlate > >>> with how long election took. What could cause it to require > exchanging so many messages? > >>> > >>> This cluster also seems to often timeout while syncing after > >>> selecting the leader. On the new leader I see the expected number of > "Follower sid: ..." > >>> messages but it doesn't always get to the "Synchronizing with Follower" > >>> part. On the leader I then get a "Unexpected exception causing > >>> shutdown while sock still open" message followed by a GOODBYE for > >>> one of the followers. At the same time on that follower I see a > >>> "Exception when following the leader" caused by a read timeout in > >>> Learner.registerWithLeader. Does anything happen prior to syncing > >>> that could take a long time? > >>> > >>> Thanks, > >>> Karol > >>> > >>>>> On 28 Apr 2015, at 20:22, Karol Dudzinski > >>>>> > >>>> wrote: > >>>> > >>>> Well these are prod clusters so my ability to experiment is rather > >>> limited. I can take a copy of the snapshot and try both 3 node and > >>> 5 in a test cluster. > >>>> > >>>> One thing I forgot to mention is that in most clusters the number > >>>> of > >>> election notification log lines I see is typically, give or take, > >>> the same as the number of participants. In this cluster however, > >>> it's typically 2 or 3 times as many notifications as the number of > participants. > >>>> > >>>> My gut feeling is it's more likely to be due to load as the 5 node > >>> cluster is much busier and the election time has been increasing > >>> over time (as has load). I have no idea exactly what load though, > >>> whether it's number of clients, frequency of transactions, total > >>> data size, etc. I don't understand why though but that may just be > >>> my limited knowledge of the election protocol. > >>>> > >>>> Karol > >>>> > >>>>> On 28 Apr 2015, at 19:54, Camille Fournier > wrote: > >>>>> > >>>>> Just out of curiosity, if you start the 5 node cluster up with > >>>>> only 3 of the nodes to begin with (like, config 5, but only bring > >>>>> up 3 processes), does it speed up the leader election or is it still > slow? > >>>>> > >>>>> C > >>>>> > >>>>> On Tue, Apr 28, 2015 at 1:41 PM, Karol Dudzinski < > >>> karoldudzinski@gmail.com> > >>>>> wrote: > >>>>> > >>>>>> Hi, > >>>>>> > >>>>>> We're seeing some rather strange leader election in one of our > >>> clusters. > >>>>>> The duration reported by the "FOLLOWING - LEADER ELECTION TOOK" > >>>>>> log > >>> line > >>>>>> (and equivalent for the leader) seems to vary hugely. During one > >>> rolling > >>>>>> reboot, I saw the number reported as small as 39ms and as large > >>>>>> as 57 seconds (difference in units is not a typo). The average > >>>>>> is just > >>> about 10 > >>>>>> seconds and std dev also about 10 seconds. So the time taken is > >>>>>> not > >>> only > >>>>>> quite large, it's also very variable. > >>>>>> > >>>>>> We have other clusters but the average election time in those is > >>>>>> in the hundreds of millis with std dev in a similar ballpark. I > >>>>>> guess one difference is the "slow" cluster is 5 participants > >>>>>> while the others > >>> are 3, > >>>>>> which may be a factor but I wouldn't expect it to make two orders > >>>>>> of magnitude difference! > >>>>>> > >>>>>> So my question is, what factors contribute to the election time > >>> reported > >>>>>> by these log lines? And what can we do to speed this up? > >>>>>> > >>>>>> As far as I understand from logs and a quick browse through the > >>>>>> code > >>> that > >>>>>> time is the time to select a leader. Syncing up to the leader > >>>>>> happens after that. The syncing part I can understand will vary > >>>>>> depending on > >>> load > >>>>>> but I don't see why selecting the leader would. > >>>>>> > >>>>>> Thanks, > >>>>>> Karol > >> > >> > >> > > --089e0160a330a57b4805150736f7--