zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Abraham Fine <af...@apache.org>
Subject Re: ZooKeeper Cluster Failed (and did not recover)
Date Thu, 30 Nov 2017 00:52:34 GMT
Hi Jim-

What version of ZooKeeper are you using?

It would be great to get a better understanding of what is causing:
Nov 29 19:32:12 localhost java.lang.Exception: shutdown Leader! reason:
Not sufficient followers synced, only synced with sids: [ 3,4 ]
Nov 29 19:32:12 localhost     at
org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:517)
Nov 29 19:32:12 localhost     at
org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:492)

This failure appears to align with:
Nov 29 19:32:12 localhost zookeeper: WARN 
[QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
following the leader

Is there any possibility debug logs could be enabled?

Thanks,
Abe

Would you be able to turn on debug logging, I think that might give us
much more visibility?
On Wed, Nov 29, 2017, at 18:06, Jim Wong wrote:
> Hey, folks.  We’re running a 5-node ZooKeeper cluster to manage server
> configuration and discovery.  This morning, we experienced an outage in
> which the entire cluster went offline for approximately 7 minutes. 
> Restarting the ZooKeeper daemons on each node in the cluster got things
> working again, so there doesn’t appear to have been an underlying network
> or hardware fault that would explain the cluster’s inability to recover.
> 
> This is actually the second time this has happened in the last several
> months, and all of our attempts to reproduce the behavior by
> intentionally doing bad things to the cluster or the leader have failed.
> 
> With all that in mind, we were wondering if anyone had any thoughts as to
> what might be going on, as we’re out of ideas.  If anyone can provide any
> hints or guidance, we’d be very grateful.
> 
> I’ve included brief excerpts from logs from some of the nodes; the
> problem started around 17:32:12 and wasn’t resolved until 17:39, after
> we’d restarted ZooKeeper.  This was was originally the leader:
> 
> Nov 29 19:31:44 localhost zookeeper: INFO  [ProcessThread(sid:4
> cport:-1)::PrepRequestProcessor@486] - Processed session termination for
> sessionid: 0x15fdd1d1e0c0313
> Nov 29 19:32:11 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:11 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:11 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:12 localhost zookeeper: ERROR
> [LearnerHandler-/10.1.29.76:49198:LearnerHandler@648] - Unexpected
> exception causing shutdown while sock still open
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [LearnerHandler-/10.1.29.76:49198:LearnerHandler@661] - ******* GOODBYE
> /10.1.29.76:49198 ********
> Nov 29 19:32:12 localhost zookeeper: ERROR
> [LearnerHandler-/10.1.48.67:44882:LearnerHandler@648] - Unexpected
> exception causing shutdown while sock still open
> Nov 29 19:32:12 localhost zookeeper: ERROR
> [LearnerHandler-/10.1.0.252:50932:LearnerHandler@648] - Unexpected
> exception causing shutdown while sock still open
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [LearnerHandler-/10.1.0.252:50932:LearnerHandler@661] - ******* GOODBYE
> /10.1.0.252:50932 ********
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [LearnerHandler-/10.1.48.67:44882:LearnerHandler@661] - ******* GOODBYE
> /10.1.48.67:44882 ********
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LEADING (my
> state)
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LEADING (my
> state)
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:Leader@511] - Shutting down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:Leader@517] - Shutdown called
> Nov 29 19:32:12 localhost java.lang.Exception: shutdown Leader! reason:
> Not sufficient followers synced, only synced with sids: [ 3,4 ]
> Nov 29 19:32:12 localhost     at
> org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:517)
> Nov 29 19:32:12 localhost     at
> org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:492)
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [LearnerCnxAcceptor-zookeeper-4-internal.prod.airtime.com/10.1.72.145:2888:Leader$LearnerCnxAcceptor@331]
> - exception while shutting down acceptor: java.net.SocketException:
> Socket closed
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@1044] - Closed
> socket connection for client /10.1.10.8:42938 which had sessionid
> 0x45f3a424dd20208
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@505] - shutting
> down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:SessionTrackerImpl@225] -
> Shutting down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:PrepRequestProcessor@764] -
> Shutting down
> Nov 29 19:32:12 localhost zookeeper: INFO  [ProcessThread(sid:4
> cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:ProposalRequestProcessor@88] -
> Shutting down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:CommitProcessor@184] - Shutting
> down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:Leader$ToBeAppliedRequestProcessor@676]
> - Shutting down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@402] -
> shutdown of request processor complete
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@208] -
> Shutting down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [SyncThread:4:SyncRequestProcessor@186] - SyncRequestProcessor exited!
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [CommitProcessor:4:CommitProcessor@153] - CommitProcessor exited loop!
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [LearnerHandler-/10.1.37.117:52956:LearnerHandler@661] - ******* GOODBYE
> /10.1.37.117:52956 ********
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [LearnerHandler-/10.1.37.117:52956:LearnerHandler@673] - Ignoring
> unexpected exception
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:QuorumPeer@865] - LOOKING
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@818] - New
> election. My id =  4, proposed zxid=0x2b0260b5a8
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:12 zookeeper-4 dhclient[2170]: DHCPREQUEST on eth0 to
> 10.1.64.1 port 67 (xid=0x79592ad4)
> Nov 29 19:32:12 zookeeper-4 dhclient[2170]: DHCPACK from 10.1.64.1
> (xid=0x79592ad4)
> Nov 29 19:32:12 zookeeper-4 dhclient[2170]: bound to 10.1.72.145 --
> renewal in 1608 seconds.
> Nov 29 19:32:12 zookeeper-4 ec2net: [get_meta] Trying to get
> http://169.254.169.254/latest/meta-data/network/interfaces/macs/02:c0:01:e2:a6:b2/local-ipv4s
> Nov 29 19:32:12 zookeeper-4 ec2net: [rewrite_aliases] Rewriting aliases
> of eth0
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 400
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:13 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:13 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 800
> Nov 29 19:32:13 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:13 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:13 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:14 localhost zookeeper: INFO 
> [SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited loop!
> Nov 29 19:32:14 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 1600
> Nov 29 19:32:14 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:15 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 3200
> Nov 29 19:32:15 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:18 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 6400
> Nov 29 19:32:18 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:23 zookeeper-4 dhclient[2295]: XMT: Solicit on eth0,
> interval 125280ms.
> Nov 29 19:32:25 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 12800
> Nov 29 19:32:25 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:38 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 25600
> Nov 29 19:32:38 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:33:03 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 51200
> Nov 29 19:33:03 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:33:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:33:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:34:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:34:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:35:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:35:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:36:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:36:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:37:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:37:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:38:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:38:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 4 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:45 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:45 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:45 localhost zookeeper: WARN 
> [RecvWorker:5:QuorumCnxManager$RecvWorker@1028] - Connection broken for
> id 5, my id = 4, error = 
> Nov 29 19:39:45 localhost zookeeper: WARN 
> [RecvWorker:5:QuorumCnxManager$RecvWorker@1031] - Interrupting SendWorker
> Nov 29 19:39:45 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:45 localhost zookeeper: last message repeated 7 times
> Nov 29 19:39:45 localhost zookeeper: WARN 
> [SendWorker:5:QuorumCnxManager$SendWorker@949] - Exception when using
> channel: for id 5 my id = 4 error = java.net.SocketException: Socket
> closed
> Nov 29 19:39:45 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:45 localhost zookeeper: last message repeated 2 times
> Nov 29 19:39:45 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:45 localhost zookeeper: WARN 
> [SendWorker:5:QuorumCnxManager$SendWorker@954] - Send worker leaving
> thread
> Nov 29 19:39:47 localhost zookeeper: INFO 
> [zookeeper-4-internal.prod.airtime.com/10.1.72.145:3888:QuorumCnxManager$Listener@746]
> - Received connection request /10.1.48.67:58398
> Nov 29 19:39:47 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x1
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:47 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x41
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:47 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x41
> (n.round), LOOKING (n.state), 4 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:47 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x41
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:47 localhost zookeeper: INFO 
> [WorkerReceiver[myid=4]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x41
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:51 zookeeper-4 monit[7081]: restart service
> 'zookeeper-server' on user request
> 
> 
> And here are logs from one of the followers:
> 
> Nov 29 19:32:11 localhost zookeeper: INFO 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@108] - Revalidating
> client: 0x15fc2e9562e0187
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
> following the leader
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@1044] - Closed
> socket connection for client /10.1.29.33:57934 which had sessionid
> 0x35f3a424dc802a2
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FollowerZooKeeperServer@140] -
> Shutting down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@505] - shutting
> down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FollowerRequestProcessor@107] -
> Shutting down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:CommitProcessor@184] - Shutting
> down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [FollowerRequestProcessor:5:FollowerRequestProcessor@97] -
> FollowerRequestProcessor exited loop!
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@402] -
> shutdown of request processor complete
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [CommitProcessor:5:CommitProcessor@153] - CommitProcessor exited loop!
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@208] -
> Shutting down
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [SyncThread:5:SyncRequestProcessor@186] - SyncRequestProcessor exited!
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:QuorumPeer@865] - LOOKING
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@818] - New
> election. My id =  5, proposed zxid=0x2b0260b5a8
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:12 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 400
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 800
> Nov 29 19:32:12 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:13 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 1600
> Nov 29 19:32:13 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:15 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 3200
> Nov 29 19:32:15 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:18 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 6400
> Nov 29 19:32:18 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:24 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 12800
> Nov 29 19:32:24 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:32:37 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 25600
> Nov 29 19:32:37 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:33:03 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 51200
> Nov 29 19:33:03 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:33:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:33:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:34:42 localhost zookeeper: WARN 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught
> end of stream exception
> Nov 29 19:34:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:34:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:35:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:35:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:36:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:36:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:37:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:37:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:38:54 localhost zookeeper: INFO 
> [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@852] -
> Notification time out: 60000
> Nov 29 19:38:54 localhost zookeeper: INFO 
> [WorkerReceiver[myid=5]:FastLeaderElection@600] - Notification: 1
> (message format version), 5 (n.leader), 0x2b0260b5a8 (n.zxid), 0x40
> (n.round), LOOKING (n.state), 5 (n.sid), 0x2b (n.peerEpoch) LOOKING (my
> state)
> Nov 29 19:39:45 zookeeper-5 monit[7157]: restart service
> 'zookeeper-server' on user request
> 
> 

Mime
View raw message