zookeeper-user mailing list archives

From Jan Kosecki <jan.koseck...@gmail.com>
Subject [ZOOKEEPER-2164] This ZooKeeper instance is not currently serving requests
Date Wed, 22 Jan 2020 11:40:46 GMT
Hello,

I have a Kubernetes cluster with 3 ZooKeeper nodes (as a StatefulSet) that,
for cost-saving purposes, is downscaled every evening and upscaled in the
morning.
Since the upgrade to 3.5.6 (previously I was using the ZooKeeper shipped
with the Kafka 2.3.0 archive), the nodes have been experiencing issues with
establishing the quorum.
I believe it's related to the JIRA ticket
https://issues.apache.org/jira/browse/ZOOKEEPER-2164. However, although it
seems to me to be quite a serious bug, the ticket is stale. Have any other
steps been taken to fix that issue, or are there any workarounds?

Currently, 2 out of 3 nodes have established a quorum and the second node
always returns "This ZooKeeper instance is not currently serving
requests", although there are no errors in the logs, only some repeated
lines about FastLeaderElection:

[2020-01-22 11:38:46,076] INFO Have smaller server identifier, so dropping
the connection: (3, 2) (org.apache.zookeeper.server.quorum.QuorumCnxManager)
[2020-01-22 11:38:46,076] INFO Notification: 2 (message format version), 2
(n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0
(n.peerEPoch), LOOKING (my state)0 (n.config version)
(org.apache.zookeeper.server.quorum.FastLeaderElection)
[2020-01-22 11:38:46,078] INFO Notification: 2 (message format version), 1
(n.leader), 0x100000010 (n.zxid), 0x1 (n.round), LEADING (n.state), 1
(n.sid), 0x2 (n.peerEPoch), LOOKING (my state)0 (n.config version)
(org.apache.zookeeper.server.quorum.FastLeaderElection)
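
For reading notifications like the two above, here is a minimal Python
sketch that splits one line into its labelled fields and decodes the zxid
(high 32 bits are the epoch, low 32 bits the counter). It assumes the line
has been rejoined after mail wrapping. The helper names parse_notification
and split_zxid are made up for illustration and are not part of ZooKeeper.

```python
import re

# Matches "value (label)" pairs such as "0x1 (n.round)".
FIELD_RE = re.compile(r"(\w+)\s+\(([^)]+)\)")


def parse_notification(line):
    """Return {label: value} for one FastLeaderElection notification line."""
    body = line.split("Notification:", 1)[1]
    # The log prints "(my state)0 (n.config version)" without a separator,
    # so add one before matching.
    body = body.replace("(my state)", "(my state), ")
    return {label: value for value, label in FIELD_RE.findall(body)}


def split_zxid(zxid_hex):
    """Split a hex zxid string into (epoch, counter)."""
    zxid = int(zxid_hex, 16)
    return zxid >> 32, zxid & 0xFFFFFFFF


if __name__ == "__main__":
    sample = (
        "[2020-01-22 11:38:46,078] INFO Notification: 2 (message format version), "
        "1 (n.leader), 0x100000010 (n.zxid), 0x1 (n.round), LEADING (n.state), "
        "1 (n.sid), 0x2 (n.peerEPoch), LOOKING (my state)0 (n.config version) "
        "(org.apache.zookeeper.server.quorum.FastLeaderElection)"
    )
    fields = parse_notification(sample)
    print(fields["n.sid"], fields["n.state"], split_zxid(fields["n.zxid"]))
```

Applied to the second notification, this gives n.leader = 1, n.state =
LEADING and a zxid of epoch 1, counter 16, i.e. server 1 reports itself as
leader while the receiving node is still LOOKING.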

I understand that scaling ZooKeeper down to 1 node and then scaling up step
by step should allow the nodes to establish a correct quorum of 3, but I
don't want to have to do this every morning.
Also, if I understand the issue correctly, if I were now to perform a
rolling update of the ZooKeeper pods (which happens in reverse order), the
pods would again fail to establish a quorum.
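
For reference, the step-by-step scale-up mentioned above could be scripted
roughly as follows. This is only a sketch of the manual workaround, not a
fix for the underlying issue. The StatefulSet name "zookeeper", the
namespace "default" and the function name scale_up_plan are assumptions
for illustration. It only builds and prints the kubectl commands; whether
"rollout status" waits long enough depends on what the pods' readiness
probe actually checks.

```python
import shlex


def scale_up_plan(statefulset, target_replicas, namespace="default"):
    """Return kubectl commands that grow a StatefulSet one replica at a time."""
    commands = []
    for replicas in range(1, target_replicas + 1):
        # Scale to the next size ...
        commands.append([
            "kubectl", "-n", namespace, "scale", "statefulset", statefulset,
            f"--replicas={replicas}",
        ])
        # ... then wait for the pods to be reported ready before continuing.
        commands.append([
            "kubectl", "-n", namespace, "rollout", "status",
            f"statefulset/{statefulset}",
        ])
    return commands


if __name__ == "__main__":
    # "zookeeper" is a hypothetical StatefulSet name.
    for command in scale_up_plan("zookeeper", 3):
        print(shlex.join(command))
```

Each printed command can be run by hand, or passed to
subprocess.run(command, check=True) to execute the sequence.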

Thanks,
Jan
