zookeeper-user mailing list archives

From Ian Spence <Ian.Spe...@globalrelay.net>
Subject Re: [**SPAM**] RE: ZK Server does not join quorum after restart
Date Fri, 25 Jan 2019 00:37:10 GMT
Hi Daniel,

Thanks for the quick reply. We use static IP addresses on all of the servers, so the address
did not change after the reboot.

Thanks,
-Ian

From: Daniel Chan <daniel.cw.chan@oracle.com> on behalf of Daniel Chan <daniel.cw.chan@oracle.com>
Reply-To: "user@zookeeper.apache.org" <user@zookeeper.apache.org>
Date: Thursday, January 24, 2019 at 16:36
To: "user@zookeeper.apache.org" <user@zookeeper.apache.org>
Subject: [**SPAM**] RE: ZK Server does not join quorum after restart


If its IP address changed, you have hit a known bug (https://issues.apache.org/jira/browse/ZOOKEEPER-1506)
and you need to bounce the cluster.

Thanks,
Daniel

-----Original Message-----
From: Ian Spence <Ian.Spence@globalrelay.net>
Sent: Thursday, January 24, 2019 2:36 PM
To: user@zookeeper.apache.org
Subject: ZK Server does not join quorum after restart

Hello

We have a cluster of 5 ZK servers, all running ZK 3.4.6 on Java 1.8 on CentOS 6. These are
physical devices, not virtual machines.

One server required hardware maintenance, and was restarted. When the zk software was restarted,
it did not rejoin the quorum as a follower.

Running “stat” or “mntr” commands returns: “This ZooKeeper instance is not currently
serving requests”
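
The check described above can be scripted. The following is a minimal sketch, not something from the original message: it assumes the servers accept four-letter commands on the default client port 2181 (the thread does not state the port), and the function names and the host names in the usage comment are hypothetical. It sends "stat" (or "mntr") over a plain TCP socket and looks for the "not currently serving requests" message quoted above.

```python
import socket

# Message quoted in this thread, returned when the server answers but is not serving.
NOT_SERVING = "This ZooKeeper instance is not currently serving requests"


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a four-letter command such as "stat" or "mntr" and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection once it has replied
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def is_serving(reply):
    """True if the reply is non-empty and is not the 'not serving' message."""
    return bool(reply.strip()) and NOT_SERVING not in reply


def check_ensemble(servers, cmd="stat"):
    """Map each (host, port) to True/False, or None if the server is unreachable."""
    status = {}
    for host, port in servers:
        try:
            status[(host, port)] = is_serving(four_letter_word(host, port, cmd))
        except OSError:  # connection refused, timeout, DNS failure, ...
            status[(host, port)] = None
    return status


# Hypothetical usage (host names and port are assumptions):
# print(check_ensemble([("zk1.example.com", 2181), ("zk2.example.com", 2181)]))
```

A server that maps to False is reachable but not serving requests, which is the symptom reported here. A server that maps to None could not be contacted at all.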

I googled this message and came across this bug: https://issues.apache.org/jira/browse/ZOOKEEPER-2164

Does anybody know if there is a work-around to this issue? We’ve seen this problem multiple
times in the past and our current solution is to bring down the zk cluster (which is a huge
outage-causing pain).

Thanks

- Ian
