zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Zhang <yangzhangr...@hotmail.com>
Subject Re: Zookeeper leader election takes a long time.
Date Thu, 13 Oct 2016 22:48:20 GMT
Btw, also, from n3.log it looks like id: 3 actually started to follow id:2

016-10-03 12:39:18,682 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 2 (n.leader), 0x20002bfb5 (n.zxid), 0x7 (n.round),
LEADING (n.state), 2 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:39:18,722 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 2 (n.leader), 0x20002bfb5 (n.zxid), 0x7 (n.round),
FOLLOWING (n.state), 1 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:39:18,722 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002:QuorumPeer@784]
- FOLLOWING
2016-10-03 12:39:18,722 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002:ZooKeeperServer@162]
- Created server with tickTime 5000 minSessionTimeout 10000 maxSessionTimeout 100000 datadir
/var/lib/zookeeper/version-2 snapdir /var/lib/zookeeper/version-2
2016-10-03 12:39:18,722 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002:Follower@63] -
FOLLOWING - LEADER ELECTION TOOK - 182
2016-10-03 12:39:18,756 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002:Learner@323] -
Getting a diff from the leader 0x300000130
2016-10-03 12:39:18,760 [myid:3] - WARN  [QuorumPeer[myid=3]/127.0.0.1:5002:Learner@374] -
Got zxid 0x300000001 expected 0x1
2016-10-03 12:39:18,763 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002:FileTxnSnapLog@240]
- Snapshotting: 0x300000130 to /var/lib/zookeeper/version-2/snapshot.300000130
2016-10-03 12:39:18,801 [myid:3] - WARN  [QuorumPeer[myid=3]/127.0.0.1:5002:Follower@118]
- Got zxid 0x300000133 expected 0x1


On Oct 13, 2016, at 3:41 PM, Ryan Zhang <yangzhangrice@hotmail.com<mailto:yangzhangrice@hotmail.com>>
wrote:

Hi, Anand, I took a look and I wonder how do you explain the following

The N1.log starts at around
2016-10-03 12:37:38,073 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002:QuorumPeer@714]
- LOOKING

and it failed to talk to Node 2 (id: 2)
2016-10-03 12:37:38,136 [myid:1] - WARN  [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot
open channel to 2 at election address node2.controller.local/127.0.0.8:5004
java.net.ConnectException: Connection refused
       at java.net.PlainSocketImpl.socketConnect(Native Method)
       at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
       at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
       at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
       at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
       at java.net.Socket.connect(Socket.java:579)

However, after failed to get any election messages from Node 3, it started a new round but
then I seem to see the election messages from Node 2 again. Any idea why?

2016-10-03 12:38:32,815 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002:QuorumPeer@714]
- LOOKING
2016-10-03 12:38:32,816 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002:FileSnap@83] -
Reading snapshot /var/lib/zookeeper/version-2/snapshot.20002aa0b
2016-10-03 12:38:32,875 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002:FastLeaderElection@815]
- New election. My id =  1, proposed zxid=0x20002bfb5
2016-10-03 12:38:32,876 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] -
Notification: 1 (message format version), 1 (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round),
LOOKING (n.state), 1 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:38:32,878 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] -
Notification: 1 (message format version), 2 (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round),
LOOKING (n.state), 2 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:38:32,878 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] -
Notification: 1 (message format version), 2 (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round),
LOOKING (n.state), 1 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:38:32,917 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] -
Notification: 1 (message format version), 3 (n.leader), 0x20002bfb5 (n.zxid), 0x4 (n.round),
LEADING (n.state), 3 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:38:33,117 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002:FastLeaderElection@849]
- Notification time out: 400
2016-10-03 12:38:33,118 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] -
Notification: 1 (message format version), 2 (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round),
LOOKING (n.state), 1 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:38:33,159 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] -
Notification: 1 (message format version), 3 (n.leader), 0x20002bfb5 (n.zxid), 0x4 (n.round),
LEADING (n.state), 3 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:38:33,559 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002:FastLeaderElection@849]
- Notification time out: 800
2016-10-03 12:38:33,560 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] -
Notification: 1 (message format version), 2 (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round),
LOOKING (n.state), 1 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:38:33,561 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] -
Notification: 1 (message format version), 2 (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round),
LEADING (n.state), 2 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:38:33,601 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] -
Notification: 1 (message format version), 3 (n.leader), 0x20002bfb5 (n.zxid), 0x4 (n.round),
LEADING (n.state), 3 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
2016-10-03 12:38:34,402 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002:FastLeaderElection@849]
- Notification time out: 1600



On Oct 8, 2016, at 10:05 AM, Anand Parthasarathy <anpartha@avinetworks.com<mailto:anpartha@avinetworks.com><mailto:anpartha@avinetworks.com>>
wrote:

Hi Flavio,

I have attached the logs from node 1 and node 3. Node 2 was powered off around 10-03 12:36.
Leader election kept going until 10-03 15:57:16 when it finally converged.

Thanks,
Anand.

On Sat, Oct 8, 2016 at 7:55 AM, Flavio Junqueira <fpj@apache.org<mailto:fpj@apache.org><mailto:fpj@apache.org>>
wrote:
Hi Anand,

I don't understand whether 1 and 3 were able or even trying to connect to each other. They
should be able to elect a leader between them and make progress. You might want to upload
logs and let us know.

-Flavio

On 08 Oct 2016, at 02:11, Anand Parthasarathy <anpartha@avinetworks.com<mailto:anpartha@avinetworks.com><mailto:anpartha@avinetworks.com>>
wrote:

Hi,

We are currently using zookeeper 3.4.6 version and use a 3 node solution in
our system. We see that occasionally, when a node is powered off (in this
instance, it was actually a leader node), the remaining two nodes do not
form a quorum for a really long time. Looking at the logs, it appears the
sequence is as follows:
- Node 2 is the zookeeper leader
- Node 2 is powered off
- Node 1 and Node 3 recognize and start the election
- Node 3 times out after initLimit * tickTime with "Timeout while waiting
for quorum" for Round N
- Node 1 times out after initLimit * tickTime with "Exception while trying
to follow leader" for Round N+1 at the same time.
- And the process continues where N is sequentially incrementing.
- This happens for a long time.
- In one instance, we used tickTime=5000 and initLimit=20 and it took
around 3.5 hours to converge.
- In a given round, Node 1 will try connecting to Node 2, gets connection
refused waits for notification timeout which increases by 2 every iteration
until it hits the initLimit. Connection Refused is because the node 2 comes
up after reboot, but zookeeper process is not started (due to a different
failure).

It looks similar to ZOOKEEPER-2164 but there it is a connection timeout
where Node 2 is not reachable.

Could you pls. share if you have seen this issue and if so, what is the
workaround that can be employed in 3.4.6.

Thanks,
Anand.


<log.tgz>



Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message