zookeeper-user mailing list archives

From Prasanth Mathialagan <prasanthmathiala...@gmail.com>
Subject Re: Help Needed: Leadership Issue upon ZK Restart (ZooKeeper 3.4.9)
Date Fri, 11 May 2018 23:35:31 GMT
Is 1.1.1.143:3888 reachable from the host in which you see this error?
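
One quick way to answer that is to attempt a plain TCP connect to the election port from the affected host. The snippet below is only a minimal sketch: `check_port` is a hypothetical helper name and the 3-second timeout is an arbitrary assumption. A "refused" result usually means the host answered but nothing is listening on that port (for example, the ZooKeeper process on that node is down or bound to a different address), while a timeout more often points to a firewall or routing problem.

```python
import socket
from typing import Tuple


def check_port(host: str, port: int, timeout: float = 3.0) -> Tuple[bool, str]:
    """Attempt a TCP connection and return (reachable, reason).

    Hypothetical helper for illustration; not part of ZooKeeper.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except ConnectionRefusedError:
        # The peer (or something in between) actively rejected the connection.
        return False, "refused"
    except socket.timeout:
        # No answer within the timeout; packets may be dropped on the way.
        return False, "timed out"
    except OSError as exc:
        # Anything else: no route to host, name resolution failure, etc.
        return False, "error: {}".format(exc)
```

Run on the host that logs the warning, `print(check_port("1.1.1.143", 3888))` would show whether the election port of server 2 accepts connections from there.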

On Fri, May 11, 2018 at 3:11 PM, Raghav <raghavastic@gmail.com> wrote:

> Hi
>
> We have a 3-node ZK ensemble as well as a 3-node Kafka cluster. Both are
> hosted on the same 3 VMs.
>
> Before Restart
> 1. We were on Kafka 0.10.2.1
>
> After Restart
> 1. We moved to Kafka 1.1
>
> We observe that the Kafka brokers report leadership issues, and for a lot of
> partitions the leader is -1. I see some logs in ZK that mainly point to a
> connectivity issue around restart time.
>
> *We have been stuck on this for a while now, and a rolling restart of ZK
> is not helping either. Can you please help or point us to how we can debug this?*
>
> 2018-05-11_17:20:49.00305 2018-05-11 17:20:49,002 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@600] - Notification: 1 (message format version), 1 (n.leader), 0x200000112 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
> 2018-05-11_17:20:49.01201 2018-05-11 17:20:49,010 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@400] - Cannot open channel to 2 at election address /1.1.1.143:3888
> 2018-05-11_17:20:49.01203 java.net.ConnectException: Connection refused
> 2018-05-11_17:20:49.01203       at java.net.PlainSocketImpl.socketConnect(Native Method)
> 2018-05-11_17:20:49.01203       at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
> 2018-05-11_17:20:49.01203       at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
> 2018-05-11_17:20:49.01204       at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> 2018-05-11_17:20:49.01204       at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> 2018-05-11_17:20:49.01204       at java.net.Socket.connect(Socket.java:589)
> 2018-05-11_17:20:49.01204       at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
> 2018-05-11_17:20:49.01204       at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:354)
> 2018-05-11_17:20:49.01205       at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:452)
> 2018-05-11_17:20:49.01205       at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:433)
> 2018-05-11_17:20:49.01206       at java.lang.Thread.run(Thread.java:745)
>
>
> Raghav
>
