From issues-return-866-archive-asf-public=cust-asf.ponee.io@zookeeper.apache.org  Tue Sep  3 17:21:02 2019
Return-Path: <issues-return-866-archive-asf-public=cust-asf.ponee.io@zookeeper.apache.org>
X-Original-To: archive-asf-public@cust-asf.ponee.io
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [207.244.88.153])
	by mx-eu-01.ponee.io (Postfix) with SMTP id 30A58180674
	for <archive-asf-public@cust-asf.ponee.io>; Tue,  3 Sep 2019 19:21:02 +0200 (CEST)
Received: (qmail 24852 invoked by uid 500); 3 Sep 2019 19:48:21 -0000
Mailing-List: contact issues-help@zookeeper.apache.org; run by ezmlm
Precedence: bulk
List-Help: <mailto:issues-help@zookeeper.apache.org>
List-Unsubscribe: <mailto:issues-unsubscribe@zookeeper.apache.org>
List-Post: <mailto:issues@zookeeper.apache.org>
List-Id: <issues.zookeeper.apache.org>
Reply-To: dev@zookeeper.apache.org
Delivered-To: mailing list issues@zookeeper.apache.org
Received: (qmail 24841 invoked by uid 99); 3 Sep 2019 19:48:21 -0000
Received: from mailrelay1-us-west.apache.org (HELO mailrelay1-us-west.apache.org) (209.188.14.139)
    by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Sep 2019 19:48:21 +0000
Received: from jira-he-de.apache.org (static.172.67.40.188.clients.your-server.de [188.40.67.172])
	by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id BFB59E30C4
	for <issues@zookeeper.apache.org>; Tue,  3 Sep 2019 17:21:00 +0000 (UTC)
Received: from jira-he-de.apache.org (localhost.localdomain [127.0.0.1])
	by jira-he-de.apache.org (ASF Mail Server at jira-he-de.apache.org) with ESMTP id 2BD7C7806CB
	for <issues@zookeeper.apache.org>; Tue,  3 Sep 2019 17:21:00 +0000 (UTC)
Date: Tue, 3 Sep 2019 17:21:00 +0000 (UTC)
From: "Karolos Antoniadis (Jira)" <jira@apache.org>
To: issues@zookeeper.apache.org
Message-ID: <JIRA.13247835.1564487794000.98162.1567531260178@Atlassian.JIRA>
In-Reply-To: <JIRA.13247835.1564487794000@Atlassian.JIRA>
References: <JIRA.13247835.1564487794000@Atlassian.JIRA> <JIRA.13247835.1564487794806@jira-he-de>
Subject: [jira] [Comment Edited] (ZOOKEEPER-3478) Leader restart shuts down
 all the followers
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3478?page=3Dcom.atlas=
sian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D=
16919707#comment-16919707 ]=20

Karolos Antoniadis edited comment on ZOOKEEPER-3478 at 9/3/19 5:20 PM:
-----------------------------------------------------------------------

Hi Ilijana,

regarding the time between restarts: It's better to avoid any specific time=
 interval.
 You might want to do this instead: After shutting down server (e.g., 21) a=
nd before bringing it up again, make sure that some other server (e.g., 20)=
 is the leader by issuing a [stat|[https://zookeeper.apache.org/doc/r3.5.5/=
zookeeperAdmin.html]] 4wl command to it. Probably this is not bullet-proof =
either, but better than waiting any specific amount of time.

By the way, looking at the logs again, I notice that server 20 timed out wh=
ile waiting for an epoch pretty quickly.=C2=A0
 What values do you use for syncLimit, initLimit, and tickTime?

Cheers,
 Karolos

+ later *insertion*
P.S. Maybe something you can try and it might avoid the issue all together =
is to put the servers with biggest ids (e.g., 20, 21, 12) in the server tha=
t contains the majority (hence server 11) and the remaining 2 servers in se=
rver 12. I'm not sure how easy it is to do this (?)=C2=A0But it seems to me=
 you have the issue because 20, 21 are next to each other and hence when yo=
u restart 21, server 20 starts the leader election sooner than 10, 11, and =
12.=C2=A0


was (Author: karolos):
Hi Ilijana,

regarding the time between restarts: It's better to avoid any specific time=
 interval.
You might want to do this instead: After shutting down server (e.g., 21) an=
d before bringing it up again, make sure that some other server (e.g., 20) =
is the leader by issuing a [stat|[https://zookeeper.apache.org/doc/r3.5.5/z=
ookeeperAdmin.html]] 4wl command to it. Probably this is not bullet-proof e=
ither, but better than waiting any specific amount of time.

By the way, looking at the logs again, I notice that server 20 timed out wh=
ile waiting for an epoch pretty quickly.=C2=A0
What values do you use for syncLimit, initLimit, and tickTime?

Cheers,
Karolos

> Leader restart shuts down all the followers
> -------------------------------------------
>
>                 Key: ZOOKEEPER-3478
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3478
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.10
>            Reporter: Lara Catipovic
>            Assignee: Karolos Antoniadis
>            Priority: Major
>
> Hello ZooKeeper Community,
> Could you please help me with at least clarifying a few doubts related to=
 ZooKeeper 3.4.10?
>  We have 2 servers in our system, one with 2 Zookeeper servers and the on=
e with 3 - meaning that in case of failure of the server with 3 Zookeeper s=
ervers, the quorum cannot be achieved.
> *Server 11*
>  Zookeeper server 10
>  Zookeeper server 11
>  Zookeeper server 12
> *Server 12*
>  Zookeeper server 20
>  Zookeeper server 21 -> Leader at the beginning of the procedure
> As we were changing something in the configuration, it was needed to rest=
art our servers, and to keep the quorum up, we restarted servers one by one=
 (first on the one with 3 servers and then the other with 2 servers).
>  During the restart of the one with 3 servers, the quorum was not lost - =
since we restarted one by one.
>  Then we tried to restart the servers on the other one where we have 2 Se=
rvers deployed, one by one also.=20
>  The restart was executed in a small amount of time. After we restarted t=
he first server 20 (follower) it joined the quorum with no errors, as expec=
ted.=20
>  *After we restarted the Leader server (21), all followers started to shu=
t down!*
> We had the same log on all the followers, but here is the example from th=
e follower 20:
> {panel}
> Jun 27 14:49:31 [myid: 20]: WARN Connection broken for id 21, my id =3D 2=
0, error =3D
>  Jun 27 14:49:31 javaOFException
>  Jun 27 14:49:31 at java.io.DataInputStream.readInt(Unknown Source)
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumCnxManager$R=
ecvWorker.run(QuorumCnxManager.java:1013)
>  Jun 27 14:49:31 [myid: 20]: INFO Accepted socket connection from /192.16=
8.1.116:18532
>  Jun 27 14:49:31 [myid: 20]: WARN Exception when following the leader
>  Jun 27 14:49:31 OFException
>  Jun 27 14:49:31 at java.io.DataInputStream.readInt(Unknown Source)
>  Jun 27 14:49:31 at org.apache.jute.BinaryInputArchive.readInt(BinaryInpu=
tArchive.java:63)
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumPacket.deser=
ialize(QuorumPacket.java:83)
>  Jun 27 14:49:31 at org.apache.jute.BinaryInputArchive.readRecord(BinaryI=
nputArchive.java:99)
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.Learner.readPacket=
(Learner.java:153)
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.Follower.followLea=
der(Follower.java:85)
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo=
rumPeer.java:937)
>  Jun 27 14:49:31 [myid: 20]: WARN Connection request from old client /192=
.168.1.116:18532; will be dropped if server is in r-o mode
>  Jun 27 14:49:31 [myid: 20]: INFO Notification: 1 (message format version=
), 12 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 12 (n.sid), 0x66 (n.peerEpoch) FOLLOWING (my state)
>  Jun 27 14:49:31 [myid: 20]: WARN Interrupting SendWorker
>  Jun 27 14:49:31 [myid: 20]: INFO Client attempting to renew session 0xa6=
b9dc92aa60200 at /192.168.1.116:18532
>  Jun 27 14:49:31 [myid: 20]: INFO shutdown called
>  Jun 27 14:49:31 java.lang.Exception: shutdown Follower
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.Follower.shutdown(=
Follower.java:166)
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo=
rumPeer.java:941)
>  Jun 27 14:49:31 [myid: 20]: INFO Revalidating client: 0xa6b9dc92aa60200
>  Jun 27 14:49:31 [myid: 20]: WARN Interrupted while waiting for message o=
n queue
>  Jun 27 14:49:31 java.InterruptedException
>  Jun 27 14:49:31 at java.util.concurrent.locks.AbstractQueuedSynchronizer=
$ConditionObject.reportInterruptAfterWait(Unknown Source)
>  Jun 27 14:49:31 at java.util.concurrent.locks.AbstractQueuedSynchronizer=
$ConditionObject.awaitNanos(Unknown Source)
>  Jun 27 14:49:31 at java.util.concurrent.ArrayBlockingQueue.poll(Unknown =
Source)
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumCnxManager.p=
ollSendQueue(QuorumCnxManager.java:1097)
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumCnxManager.a=
ccess$700(QuorumCnxManager.java:74)
>  Jun 27 14:49:31 at org.apache.zookeeper.server.quorum.QuorumCnxManager$S=
endWorker.run(QuorumCnxManager.java:932)
> {panel}
> *Is it expected that Leader in case of its restart triggers shut down of =
all its followers?*=20
>  This seems to me as an unexpected behavior, but maybe I'm wrong.
> =C2=A0
> After this step and the servers are up again, 20 tries to become a Leader=
 and server 21 accepts it and tries to follow the new Leader.
>  20 received ACK messages from itself and from 21.=20
>  There are also notifications sent about a new Leader to all other Zookee=
per servers:
> {panel}
> Jun 27 14:49:31 [myid: 20]: INFO LEADING
>  Jun 27 14:49:31 [myid: 20]: INFO Created server with tickTime 1500 minSe=
ssionTimeout 3000 maxSessionTimeout 30000 datadir /local/cudb/BCServer/vers=
ion-2 snapdir /local/cudb/BCServer/version-2
>  Jun 27 14:49:31 [myid: 20]: INFO LEADING - LEADER ELECTION TOOK - 213
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 20 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 12 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 20 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 10 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 20 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 11 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 21 (n.leader), 0x66000012c7 (n.zxid), 0x1 (n.round), LOOKING (n.state), =
21 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 21 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Follower sid: 21 : info : org.apache.zo=
okeeper.server.quorum.QuorumPeer$QuorumServer@466717f0
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 12 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 12 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 11 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 10 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 11 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
>  Jun 27 14:49:32 [myid: 20]: INFO Notification: 1 (message format version=
), 21 (n.leader), 0x66000012c7 (n.zxid), 0x19 (n.round), LOOKING (n.state),=
 10 (n.sid), 0x66 (n.peerEpoch) LEADING (my state)
> {panel}
> =C2=A0
> From the servers 10, 11, 12 (located on the server with 3 ZooKeeper serve=
rs) it can be seen they all entered the state 'FOLLOWING' and from this ste=
p we would expect the Leader to start leading, and followers to start follo=
wing:
> {panel}
> Jun 27 14:49:32 [myid: 12]: INFO FOLLOWING
>  Jun 27 14:49:32 [myid: 12]: INFO Created server with tickTime 1500 minSe=
ssionTimeout 3000 maxSessionTimeout 30000 datadir
>  Jun 27 14:49:32 [myid: 12]: INFO FOLLOWING - LEADER ELECTION TOOK - 1217
> {panel}
> =C2=A0
> But, servers from our first system (10,11,12) are not able to connect to =
new Leader (20), and it seems they are trying to connect to old Leader (21)=
 (assuming this is the case since they are all using port 4512 which corres=
ponds to Server 21). This log can be seen on all servers where we have 3 Zo=
oKeeper servers deployed (10,11,12):
> {panel}
> Jun 27 14:49:38 [myid: 12]: WARN Unexpected exception, tries=3D0, connect=
ing to /192.168.1.116:4512
>  Jun 27 14:49:38 java.net.SocketTimeoutException: connect timed out
>  Jun 27 14:49:38 at java.net.PlainSocketImpl.socketConnect(Native Method)
>  Jun 27 14:49:38 at java.net.AbstractPlainSocketImpl.doConnect(Unknown So=
urce)
>  Jun 27 14:49:38 at java.net.AbstractPlainSocketImpl.connectToAddress(Unk=
nown Source)
>  Jun 27 14:49:38 at java.net.AbstractPlainSocketImpl.connect(Unknown Sour=
ce)
>  Jun 27 14:49:38 at java.net.SocksSocketImpl.connect(Unknown Source)
>  Jun 27 14:49:38 at java.net.Socket.connect(Unknown Source)
>  Jun 27 14:49:38 at org.apache.zookeeper.server.quorum.Learner.connectToL=
eader(Learner.java:231)
>  Jun 27 14:49:38 at org.apache.zookeeper.server.quorum.Follower.followLea=
der(Follower.java:71)
>  Jun 27 14:49:38 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo=
rumPeer.java:937)
> {panel}
> *Is my understanding correct, or this port is not indicating they are try=
ing to connect to the wrong server?*
> =C2=A0
> Even if the Leader restart provoked the restart of all its followers, see=
ms that in this step all other servers should start to connect to the new l=
eader and form a quorum of followers.
>  Instead of that scenario, a few seconds later, timeout occurs while wait=
ing for epoch for quorum (the followers never start following although they=
 all received notifications, and they try to connect to old leader) and the=
 'new' Leader shuts down again:
> {panel}
> Jun 27 14:49:39 [myid: 20]: WARN Unexpected exception
>  *Jun 27 14:49:39 java.lang.InterruptedException: Timeout while waiting f=
or epoch from quorum*
>  Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.Leader.getEpochToP=
ropose(Leader.java:896)
>  Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.Leader.lead(Leader=
.java:389)
>  Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo=
rumPeer.java:950)
>  Jun 27 14:49:39 [myid: 20]: INFO Shutting down
>  Jun 27 14:49:39 [myid: 20]: INFO Shutdown called
>  Jun 27 14:49:39 java.lang.Exception: shutdown Leader! reason: Forcing sh=
utdown
>  Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.Leader.shutdown(Le=
ader.java:517)
>  Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo=
rumPeer.java:956)
>  Jun 27 14:49:39 [myid: 20]: INFO exception while shutting down acceptor:=
 java.net.SocketException: Socket closed
>  Jun 27 14:49:39 [myid: 20]: INFO LOOKING
>  Jun 27 14:49:39 [myid: 20]: INFO New election. My id =3D 20, proposed zx=
id=3D0x66000012c7
>  Jun 27 14:49:39 [myid: 20]: ERROR Unexpected exception causing shutdown
>  Jun 27 14:49:39 java.InterruptedException
>  Jun 27 14:49:39 at java.lang.Object.wait(Native Method)
>  Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.Leader.getEpochToP=
ropose(Leader.java:892)
>  Jun 27 14:49:39 at org.apache.zookeeper.server.quorum.LearnerHandler.run=
(LearnerHandler.java:358)
>  Jun 27 14:49:39 [myid: 20]: INFO Notification: 1 (message format version=
), 20 (n.leader), 0x66000012c7 (n.zxid), 0x1a (n.round), LOOKING (n.state),=
 20 (n.sid), 0x66 (n.peerEpoch) LOOKING (my state)
>  Jun 27 14:49:39 [myid: 20]: WARN ******* GOODBYE /10.22.0.2:55268 ******=
**
> {panel}
> =C2=A0
> After 3 unsuccessfull retries from servers 10,11,12, since the quorum can=
 not be achieved, connection times out and followers started to shut down a=
gain, After they are up, another election is triggered and new LEADER is no=
w located on the first node (Server that becomes a new leader is 12):
> {panel}
> Jun 27 14:50:07 [myid: 12]: ERROR Unexpected exception
>  Jun 27 14:50:07 java.net.SocketTimeoutException: connect timed out
>  Jun 27 14:50:07 at java.net.PlainSocketImpl.socketConnect(Native Method)
>  Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.doConnect(Unknown So=
urce)
>  Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.connectToAddress(Unk=
nown Source)
>  Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.connect(Unknown Sour=
ce)
>  Jun 27 14:50:07 at java.net.SocksSocketImpl.connect(Unknown Source)
>  Jun 27 14:50:07 at java.net.Socket.connect(Unknown Source)
>  Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Learner.connectToL=
eader(Learner.java:231)
>  Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Follower.followLea=
der(Follower.java:71)
>  Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo=
rumPeer.java:937)
>  Jun 27 14:50:07 [myid: 12]: WARN Exception when following the leader
>  Jun 27 14:50:07 java.net.SocketTimeoutException: connect timed out
>  Jun 27 14:50:07 at java.net.PlainSocketImpl.socketConnect(Native Method)
>  Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.doConnect(Unknown So=
urce)
>  Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.connectToAddress(Unk=
nown Source)
>  Jun 27 14:50:07 at java.net.AbstractPlainSocketImpl.connect(Unknown Sour=
ce)
>  Jun 27 14:50:07 at java.net.SocksSocketImpl.connect(Unknown Source)
>  Jun 27 14:50:07 at java.net.Socket.connect(Unknown Source)
>  Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Learner.connectToL=
eader(Learner.java:231)
>  Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Follower.followLea=
der(Follower.java:71)
>  Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo=
rumPeer.java:937)
>  Jun 27 14:50:07 [myid: 12]: INFO shutdown called
>  Jun 27 14:50:07 java.lang.Exception: shutdown Follower
>  Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.Follower.shutdown(=
Follower.java:166)
>  Jun 27 14:50:07 at org.apache.zookeeper.server.quorum.QuorumPeer.run(Quo=
rumPeer.java:941)
>  Jun 27 14:50:07 [myid: 12]: INFO Shutting down
>  Jun 27 14:50:07 [myid: 12]: INFO LOOKING
>  Jun 27 14:50:07 [myid: 12]: INFO New election. My id =3D 12, proposed zx=
id=3D0x66000012c7
>  Jun 27 14:50:08 [myid: 12]: INFO LEADING
> {panel}
> =C2=A0
> After this, all other Zookeeper servers normally start to follow the new =
leader and everything starts to work just fine.
> =C2=A0
> Could you please help me and answer the following questions:
>  - is it expected behavior that Leader shutdowns all other servers (follo=
wers) during after its own restart?
>  -> if this is expected, could you please explain in which situations we =
can expect this behavior and why?
>  - if there was a notification sent about the new leader (20) to all othe=
r servers, why they were still connecting to old leader?
>  - do you have any recommendations on how to 'fix' this behavior?
> Any help will be highly appreciated.
>  Thanks in advance!
> Kind regards,
>  Lara


--
This message was sent by Atlassian Jira
(v8.3.2#803003)