From issues-return-866-archive-asf-public=cust-asf.ponee.io@zookeeper.apache.org Tue Sep 3 17:21:02 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id 30A58180674 for ; Tue, 3 Sep 2019 19:21:02 +0200 (CEST) Received: (qmail 24852 invoked by uid 500); 3 Sep 2019 19:48:21 -0000 Mailing-List: contact issues-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@zookeeper.apache.org Delivered-To: mailing list issues@zookeeper.apache.org Received: (qmail 24841 invoked by uid 99); 3 Sep 2019 19:48:21 -0000 Received: from mailrelay1-us-west.apache.org (HELO mailrelay1-us-west.apache.org) (209.188.14.139) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Sep 2019 19:48:21 +0000 Received: from jira-he-de.apache.org (static.172.67.40.188.clients.your-server.de [188.40.67.172]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id BFB59E30C4 for ; Tue, 3 Sep 2019 17:21:00 +0000 (UTC) Received: from jira-he-de.apache.org (localhost.localdomain [127.0.0.1]) by jira-he-de.apache.org (ASF Mail Server at jira-he-de.apache.org) with ESMTP id 2BD7C7806CB for ; Tue, 3 Sep 2019 17:21:00 +0000 (UTC) Date: Tue, 3 Sep 2019 17:21:00 +0000 (UTC) From: "Karolos Antoniadis (Jira)" To: issues@zookeeper.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Comment Edited] (ZOOKEEPER-3478) Leader restart shuts down all the followers MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/ZOOKEEPER-3478?page=3Dcom.atlas= sian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D= 16919707#comment-16919707 ]=20 Karolos Antoniadis edited comment on ZOOKEEPER-3478 at 9/3/19 5:20 PM: ----------------------------------------------------------------------- Hi Ilijana, regarding the time between restarts: It's better to avoid any specific time= interval. You might want to do this instead: After shutting down server (e.g., 21) a= nd before bringing it up again, make sure that some other server (e.g., 20)= is the leader by issuing a [stat|[https://zookeeper.apache.org/doc/r3.5.5/= zookeeperAdmin.html]] 4wl command to it. Probably this is not bullet-proof = either, but better than waiting any specific amount of time. By the way, looking at the logs again, I notice that server 20 timed out wh= ile waiting for an epoch pretty quickly.=C2=A0 What values do you use for syncLimit, initLimit, and tickTime? Cheers, Karolos + later *insertion* P.S. Maybe something you can try and it might avoid the issue all together = is to put the servers with biggest ids (e.g., 20, 21, 12) in the server tha= t contains the majority (hence server 11) and the remaining 2 servers in se= rver 12. I'm not sure how easy it is to do this (?)=C2=A0But it seems to me= you have the issue because 20, 21 are next to each other and hence when yo= u restart 21, server 20 starts the leader election sooner than 10, 11, and = 12.=C2=A0 was (Author: karolos): Hi Ilijana, regarding the time between restarts: It's better to avoid any specific time= interval. You might want to do this instead: After shutting down server (e.g., 21) an= d before bringing it up again, make sure that some other server (e.g., 20) = is the leader by issuing a [stat|[https://zookeeper.apache.org/doc/r3.5.5/z= ookeeperAdmin.html]] 4wl command to it. Probably this is not bullet-proof e= ither, but better than waiting any specific amount of time. By the way, looking at the logs again, I notice that server 20 timed out wh= ile waiting for an epoch pretty quickly.=C2=A0 What values do you use for syncLimit, initLimit, and tickTime? Cheers, Karolos > Leader restart shuts down all the followers > ------------------------------------------- > > Key: ZOOKEEPER-3478 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3478 > Project: ZooKeeper > Issue Type: Bug > Affects Versions: 3.4.10 > Reporter: Lara Catipovic > Assignee: Karolos Antoniadis > Priority: Major > > Hello ZooKeeper Community, > Could you please help me with at least clarifying a few doubts related to= ZooKeeper 3.4.10? > We have 2 servers in our system, one with 2 Zookeeper servers and the on= e with 3 - meaning that in case of failure of the server with 3 Zookeeper s= ervers, the quorum cannot be achieved. > *Server 11* > Zookeeper server 10 > Zookeeper server 11 > Zookeeper server 12 > *Server 12* > Zookeeper server 20 > Zookeeper server 21 -> Leader at the beginning of the procedure > As we were changing something in the configuration, it was needed to rest= art our servers, and to keep the quorum up, we restarted servers one by one= (first on the one with 3 servers and then the other with 2 servers). > During the restart of the one with 3 servers, the quorum was not lost - = since we restarted one by one. > Then we tried to restart the servers on the other one where we have 2 Se= rvers deployed, one by one also.=20 > The restart was executed in a small amount of time. After we restarted t= he first server 20 (follower) it joined the quorum with no errors, as expec= ted.=20 > *After we restarted the Leader server (21), all followers started to shu= t down!* > We had the same log on all the followers, but here is the example from th= e follower 20: > {panel} > Jun 27 14:49:31 [myid: 20]: WARN Connection broken for id 21, my id =3D 2= 0, error =3D > Jun 27 14:49:31 javaOFException > Jun 27 14:49:31 at java.io.DataInputStream.readInt(Unknown Source) > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumCnxManager$R= ecvWorker.run(QuorumCnxManager.java:1013) > Jun 27 14:49:31 [myid: 20]: INFO Accepted socket connection from /192.16= 8.1.116:18532 > Jun 27 14:49:31 [myid: 20]: WARN Exception when following the leader > Jun 27 14:49:31 OFException > Jun 27 14:49:31 at java.io.DataInputStream.readInt(Unknown Source) > Jun 27 14:49:31 at org.apache.jute.BinaryInputArchive.readInt(BinaryInpu= tArchive.java:63) > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumPacket.deser= ialize(QuorumPacket.java:83) > Jun 27 14:49:31 at org.apache.jute.BinaryInputArchive.readRecord(BinaryI= nputArchive.java:99) > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.Learner.readPacket= (Learner.java:153) > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.Follower.followLea= der(Follower.java:85) > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo= rumPeer.java:937) > Jun 27 14:49:31 [myid: 20]: WARN Connection request from old client /192= .168.1.116:18532; will be dropped if server is in r-o mode > Jun 27 14:49:31 [myid: 20]: INFO Notification: 1 (message format version= ), 12 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 12 (n.sid), 0x66 (n.peerEpoch) FOLLOWING (my state) > Jun 27 14:49:31 [myid: 20]: WARN Interrupting SendWorker > Jun 27 14:49:31 [myid: 20]: INFO Client attempting to renew session 0xa6= b9dc92aa60200 at /192.168.1.116:18532 > Jun 27 14:49:31 [myid: 20]: INFO shutdown called > Jun 27 14:49:31 java.lang.Exception: shutdown Follower > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.Follower.shutdown(= Follower.java:166) > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo= rumPeer.java:941) > Jun 27 14:49:31 [myid: 20]: INFO Revalidating client: 0xa6b9dc92aa60200 > Jun 27 14:49:31 [myid: 20]: WARN Interrupted while waiting for message o= n queue > Jun 27 14:49:31 java.InterruptedException > Jun 27 14:49:31 at java.util.concurrent.locks.AbstractQueuedSynchronizer= $ConditionObject.reportInterruptAfterWait(Unknown Source) > Jun 27 14:49:31 at java.util.concurrent.locks.AbstractQueuedSynchronizer= $ConditionObject.awaitNanos(Unknown Source) > Jun 27 14:49:31 at java.util.concurrent.ArrayBlockingQueue.poll(Unknown = Source) > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumCnxManager.p= ollSendQueue(QuorumCnxManager.java:1097) > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumCnxManager.a= ccess$700(QuorumCnxManager.java:74) > Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumCnxManager$S= endWorker.run(QuorumCnxManager.java:932) > {panel} > *Is it expected that Leader in case of its restart triggers shut down of = all its followers?*=20 > This seems to me as an unexpected behavior, but maybe I'm wrong. > =C2=A0 > After this step and the servers are up again, 20 tries to become a Leader= and server 21 accepts it and tries to follow the new Leader. > 20 received ACK messages from itself and from 21.=20 > There are also notifications sent about a new Leader to all other Zookee= per servers: > {panel} > Jun 27 14:49:31 [myid: 20]: INFO LEADING > Jun 27 14:49:31 [myid: 20]: INFO Created server with tickTime 1500 minSe= ssionTimeout 3000 maxSessionTimeout 30000 datadir /local/cudb/BCServer/vers= ion-2 snapdir /local/cudb/BCServer/version-2 > Jun 27 14:49:31 [myid: 20]: INFO LEADING - LEADER ELECTION TOOK - 213 > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 20 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 12 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 20 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 10 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 20 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 11 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 21 (n.leader), 0x66000012c7 (n.zxid), 0x1 (n.round), LOOKING (n.state), = 21 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 21 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Follower sid: 21 : info : org.apache.zo= okeeper.server.quorum.QuorumPeer$QuorumServer@466717f0 > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 12 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 12 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 11 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 10 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 11 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version= ), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),= 10 (n.sid), 0x66 (n.peerEpoch) LEADING (my state) > {panel} > =C2=A0 > From the servers 10, 11, 12 (located on the server with 3 ZooKeeper serve= rs) it can be seen they all entered the state 'FOLLOWING' and from this ste= p we would expect the Leader to start leading, and followers to start follo= wing: > {panel} > Jun 27 14:49:32 [myid: 12]: INFO FOLLOWING > Jun 27 14:49:32 [myid: 12]: INFO Created server with tickTime 1500 minSe= ssionTimeout 3000 maxSessionTimeout 30000 datadir > Jun 27 14:49:32 [myid: 12]: INFO FOLLOWING - LEADER ELECTION TOOK - 1217 > {panel} > =C2=A0 > But, servers from our first system (10,11,12) are not able to connect to = new Leader (20), and it seems they are trying to connect to old Leader (21)= (assuming this is the case since they are all using port 4512 which corres= ponds to Server 21). This log can be seen on all servers where we have 3 Zo= oKeeper servers deployed (10,11,12): > {panel} > Jun 27 14:49:38 [myid: 12]: WARN Unexpected exception, tries=3D0, connect= ing to /192.168.1.116:4512 > Jun 27 14:49:38 java.net.SocketTimeoutException: connect timed out > Jun 27 14:49:38 at java.net.PlainSocketImpl.socketConnect(Native Method) > Jun 27 14:49:38 at java.net.AbstractPlainSocketImpl.doConnect(Unknown So= urce) > Jun 27 14:49:38 at java.net.AbstractPlainSocketImpl.connectToAddress(Unk= nown Source) > Jun 27 14:49:38 at java.net.AbstractPlainSocketImpl.connect(Unknown Sour= ce) > Jun 27 14:49:38 at java.net.SocksSocketImpl.connect(Unknown Source) > Jun 27 14:49:38 at java.net.Socket.connect(Unknown Source) > Jun 27 14:49:38 at org.apache.zookeeper.server.quorum.Learner.connectToL= eader(Learner.java:231) > Jun 27 14:49:38 at org.apache.zookeeper.server.quorum.Follower.followLea= der(Follower.java:71) > Jun 27 14:49:38 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo= rumPeer.java:937) > {panel} > *Is my understanding correct, or this port is not indicating they are try= ing to connect to the wrong server?* > =C2=A0 > Even if the Leader restart provoked the restart of all its followers, see= ms that in this step all other servers should start to connect to the new l= eader and form a quorum of followers. > Instead of that scenario, a few seconds later, timeout occurs while wait= ing for epoch for quorum (the followers never start following although they= all received notifications, and they try to connect to old leader) and the= 'new' Leader shuts down again: > {panel} > Jun 27 14:49:39 [myid: 20]: WARN Unexpected exception > *Jun 27 14:49:39 java.lang.InterruptedException: Timeout while waiting f= or epoch from quorum* > Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.Leader.getEpochToP= ropose(Leader.java:896) > Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.Leader.lead(Leader= .java:389) > Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo= rumPeer.java:950) > Jun 27 14:49:39 [myid: 20]: INFO Shutting down > Jun 27 14:49:39 [myid: 20]: INFO Shutdown called > Jun 27 14:49:39 java.lang.Exception: shutdown Leader! reason: Forcing sh= utdown > Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.Leader.shutdown(Le= ader.java:517) > Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo= rumPeer.java:956) > Jun 27 14:49:39 [myid: 20]: INFO exception while shutting down acceptor:= java.net.SocketException: Socket closed > Jun 27 14:49:39 [myid: 20]: INFO LOOKING > Jun 27 14:49:39 [myid: 20]: INFO New election. My id =3D 20, proposed zx= id=3D0x66000012c7 > Jun 27 14:49:39 [myid: 20]: ERROR Unexpected exception causing shutdown > Jun 27 14:49:39 java.InterruptedException > Jun 27 14:49:39 at java.lang.Object.wait(Native Method) > Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.Leader.getEpochToP= ropose(Leader.java:892) > Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.LearnerHandler.run= (LearnerHandler.java:358) > Jun 27 14:49:39 [myid: 20]: INFO Notification: 1 (message format version= ), 20 (n.leader), 0x66000012c7 (n.zxid), 0x1a (n.round), LOOKING (n.state),= 20 (n.sid), 0x66 (n.peerEpoch) LOOKING (my state) > Jun 27 14:49:39 [myid: 20]: WARN ******* GOODBYE /10.22.0.2:55268 ******= ** > {panel} > =C2=A0 > After 3 unsuccessfull retries from servers 10,11,12, since the quorum can= not be achieved, connection times out and followers started to shut down a= gain, After they are up, another election is triggered and new LEADER is no= w located on the first node (Server that becomes a new leader is 12): > {panel} > Jun 27 14:50:07 [myid: 12]: ERROR Unexpected exception > Jun 27 14:50:07 java.net.SocketTimeoutException: connect timed out > Jun 27 14:50:07 at java.net.PlainSocketImpl.socketConnect(Native Method) > Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.doConnect(Unknown So= urce) > Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.connectToAddress(Unk= nown Source) > Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.connect(Unknown Sour= ce) > Jun 27 14:50:07 at java.net.SocksSocketImpl.connect(Unknown Source) > Jun 27 14:50:07 at java.net.Socket.connect(Unknown Source) > Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Learner.connectToL= eader(Learner.java:231) > Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Follower.followLea= der(Follower.java:71) > Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo= rumPeer.java:937) > Jun 27 14:50:07 [myid: 12]: WARN Exception when following the leader > Jun 27 14:50:07 java.net.SocketTimeoutException: connect timed out > Jun 27 14:50:07 at java.net.PlainSocketImpl.socketConnect(Native Method) > Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.doConnect(Unknown So= urce) > Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.connectToAddress(Unk= nown Source) > Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.connect(Unknown Sour= ce) > Jun 27 14:50:07 at java.net.SocksSocketImpl.connect(Unknown Source) > Jun 27 14:50:07 at java.net.Socket.connect(Unknown Source) > Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Learner.connectToL= eader(Learner.java:231) > Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Follower.followLea= der(Follower.java:71) > Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo= rumPeer.java:937) > Jun 27 14:50:07 [myid: 12]: INFO shutdown called > Jun 27 14:50:07 java.lang.Exception: shutdown Follower > Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Follower.shutdown(= Follower.java:166) > Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo= rumPeer.java:941) > Jun 27 14:50:07 [myid: 12]: INFO Shutting down > Jun 27 14:50:07 [myid: 12]: INFO LOOKING > Jun 27 14:50:07 [myid: 12]: INFO New election. My id =3D 12, proposed zx= id=3D0x66000012c7 > Jun 27 14:50:08 [myid: 12]: INFO LEADING > {panel} > =C2=A0 > After this, all other Zookeeper servers normally start to follow the new = leader and everything starts to work just fine. > =C2=A0 > Could you please help me and answer the following questions: > - is it expected behavior that Leader shutdowns all other servers (follo= wers) during after its own restart? > -> if this is expected, could you please explain in which situations we = can expect this behavior and why? > - if there was a notification sent about the new leader (20) to all othe= r servers, why they were still connecting to old leader? > - do you have any recommendations on how to 'fix' this behavior? > Any help will be highly appreciated. > Thanks in advance! > Kind regards, > Lara -- This message was sent by Atlassian Jira (v8.3.2#803003)