zookeeper-user mailing list archives

From Andor Molnar <an...@cloudera.com.INVALID>
Subject Re: [**SPAM**] RE: ZK Server does not join quorum after restart
Date Fri, 25 Jan 2019 12:14:07 GMT
Hi Ian,

Would you please attach logs from all participants of the ensemble or try
to find an exception from when the follower is trying to join?

Regards,
Andor



On Fri, Jan 25, 2019 at 1:37 AM Ian Spence <Ian.Spence@globalrelay.net>
wrote:

> Hi Daniel,
>
> Thanks for the quick reply. We use static IP addresses on all of the
> servers so it did not change after the reboot.
>
> Thanks,
> -Ian
>
> From: Daniel Chan <daniel.cw.chan@oracle.com>
> Reply-To: "user@zookeeper.apache.org" <user@zookeeper.apache.org>
> Date: Thursday, January 24, 2019 at 16:36
> To: "user@zookeeper.apache.org" <user@zookeeper.apache.org>
> Subject: [**SPAM**] RE: ZK Server does not join quorum after restart
>
>
> If its IP address changed, then you have hit a known bug,
> https://issues.apache.org/jira/browse/ZOOKEEPER-1506, and you need to
> bounce the cluster.
>
> Thanks,
> Daniel
>
> -----Original Message-----
> From: Ian Spence <Ian.Spence@globalrelay.net>
> Sent: Thursday, January 24, 2019 2:36 PM
> To: user@zookeeper.apache.org
> Subject: ZK Server does not join quorum after restart
>
> Hello
>
> We have a cluster of 5 ZK servers, all running ZK 3.4.6 on Java 1.8 on
> CentOS 6. These are physical devices, not virtual machines.
>
> One server required hardware maintenance, and was restarted. When the zk
> software was restarted, it did not rejoin the quorum as a follower.
>
> Running “stat” or “mntr” commands returns: “This ZooKeeper instance is not
> currently serving requests”
>
> I googled this message and came across this bug:
> https://issues.apache.org/jira/browse/ZOOKEEPER-2164
>
> Does anybody know if there is a work-around to this issue? We’ve seen this
> problem multiple times in the past and our current solution is to bring
> down the zk cluster (which is a huge outage-causing pain).
>
> Thanks
>
> - Ian
>
>
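
The "stat" check described in the original message can be scripted so that every member of the ensemble is probed in one pass. The following is a minimal Python sketch, not an official tool: it assumes the default client port 2181 and that four-letter commands are reachable from the machine running it. The helper names (four_letter_word, is_serving, check_ensemble) and any host names passed in are hypothetical.

```python
import socket

# Reply text quoted in the original message for a server that is not in quorum.
NOT_SERVING = "This ZooKeeper instance is not currently serving requests"


def four_letter_word(host, port=2181, cmd="stat", timeout=5.0):
    """Send a four-letter command (e.g. "stat" or "mntr") and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def is_serving(reply):
    """True if the reply is non-empty and is not the 'not serving' message."""
    return bool(reply.strip()) and NOT_SERVING not in reply


def check_ensemble(hosts, port=2181):
    """Probe each host and map it to 'serving', 'not serving', or an error."""
    status = {}
    for host in hosts:
        try:
            reply = four_letter_word(host, port)
            status[host] = "serving" if is_serving(reply) else "not serving"
        except OSError as exc:  # connection refused, timeout, DNS failure
            status[host] = f"unreachable: {exc}"
    return status
```

Calling check_ensemble with the five server addresses returns one entry per host; a restarted server that failed to rejoin the quorum, as described above, would be reported as "not serving" while the others report "serving".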
