zookeeper-user mailing list archives

From: Bruno Wassermann <baruc...@hotmail.com>
Subject: RE: All nodes shutting down simultaneously
Date: Mon, 25 Jan 2016 08:24:19 GMT
Hi Flavio,
The leader was indeed the server that got powered off (server 4).
The configuration of the servers in this cluster is as follows:

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
snapCount=1000000
initLimit=10
syncLimit=5
autopurge.purgeInterval=2
autopurge.snapRetainCount=4
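For reference, both limits are expressed in ticks, so (assuming the stock semantics, where initLimit bounds the time a follower has to connect and sync with the leader, and syncLimit bounds how far a follower may lag behind it) these settings work out to:

initLimit: 10 ticks x 2000 ms = 20 s
syncLimit:  5 ticks x 2000 ms = 10 s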
The nodes were engaged in leader election for almost 90 seconds. I see messages such as the following:

2016-01-17 19:32:17,081 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
....
2016-01-17 19:32:35,920 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 3 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LOOKING (n.state), 3 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
....
2016-01-17 19:33:41,709 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LOOKING (n.state), 5 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:41,710 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LOOKING (n.state), 3 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
....
2016-01-17 19:33:43,602 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LOOKING (n.state), 2 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:43,602 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LOOKING (n.state), 2 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
....
2016-01-17 19:33:46,575 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LOOKING (n.state), 1 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:46,575 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 3 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LOOKING (n.state), 1 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:46,575 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LOOKING (n.state), 1 (n.sid), 0x3 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:46,576 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), FOLLOWING (n.state), 2 (n.sid), 0x4 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:46,576 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), FOLLOWING (n.state), 2 (n.sid), 0x4 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:46,576 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LEADING (n.state), 5 (n.sid), 0x4 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:46,576 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), LEADING (n.state), 5 (n.sid), 0x4 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:46,576 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), FOLLOWING (n.state), 3 (n.sid), 0x4 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:46,576 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 5 (n.leader), 0x3000f3404 (n.zxid), 0x4 (n.round), FOLLOWING (n.state), 3 (n.sid), 0x4 (n.peerEpoch) LOOKING (my state)
2016-01-17 19:33:46,576 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@784] - FOLLOWING
2016-01-17 19:33:46,576 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 89489
2016-01-17 19:33:46,580 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@323] - Getting a diff from the leader 0x400000019
2016-01-17 19:33:46,580 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@374] - Got zxid 0x400000001 expected 0x1
2016-01-17 19:33:46,581 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@240] - Snapshotting: 0x400000019 to /var/lib/zookeeper/version-2/snapshot.400000019
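If I am reading this trace correctly, server 5 ends up LEADING for epoch 0x4, and the "LEADER ELECTION TOOK - 89489" figure is in milliseconds, i.e. 89489 ms is roughly 89.5 s, which matches the almost 90 seconds of LOOKING notifications above.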

At the time of the incident, the amount of data held was about 10.2 MB in about 1,800 znodes.
The snapshot closest to the time of the incident was 18 MB, and the transaction log for that day was 12 GB. If the initLimit were being hit, should there be something about it in the log files?
If the long election time was really due to a large snapshot, we could decrease the snapCount
and increase both the initLimit and syncLimit. Is this correct? 
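Concretely, something like the following sketch (the exact numbers are guesses on my part and would still need to be validated against our workload):

snapCount=100000
initLimit=30
syncLimit=15

With tickTime=2000 that would mean snapshots roughly ten times as often, a 60 s init window, and a 30 s sync window.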
Thanks, 
-- Bruno
> Subject: Re: All nodes shutting down simultaneously
> From: fpj@apache.org
> Date: Sun, 24 Jan 2016 14:19:43 +0000
> CC: brunow@il.ibm.com
> To: user@zookeeper.apache.org
> 
> Hi Bruno,
> 
> If the leader crashes, then the followers are supposed to start complaining around the
> same time. Was the server that was powered down the leader of the ensemble?
> 
> Another leader election shouldn't take two minutes, so you could check the logs for what
> happened around the time the ensemble wasn't able to get a leader established. Were election
> notifications being exchanged regularly? Was the snapshot so large that the init limit was
> kicking in?
> 
> When you say that the servers shut themselves down, I suppose you're referring to the
> fact that you've seen a message in the logs saying so. The server processes remained up
> and haven't exited, yes?
> 
> -Flavio
> 
> > On 24 Jan 2016, at 12:05, Bruno Wassermann <baruch_w@hotmail.com> wrote:
> > 
> > Hi, I have recently come across something I have not seen before: all 5 nodes of
> > a ZK cluster shutting themselves down at around the same time. All servers had worked fine
> > for a long time. Between 2016-01-17 19:31 and 19:33:xx, the servers complained about not
> > being able to follow the leader, and the zk processes shut themselves down. One of the
> > servers (an actual bare-metal host) was somehow powered off at 19:33. The ZK cluster was
> > down for approximately two minutes. Leader election took 90 seconds. The leader then
> > expired more than 100 client connections. A snippet from the log file of one of the nodes:
> > 
> > 2016-01-17 19:32:17,081 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when following the leader
> > java.net.SocketTimeoutException: Read timed out
> >     at java.net.SocketInputStream.socketRead0(Native Method)
> >     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> >     at java.net.SocketInputStream.read(SocketInputStream.java:170)
> >     at java.net.SocketInputStream.read(SocketInputStream.java:141)
> >     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> >     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> >     at java.io.DataInputStream.readInt(DataInputStream.java:387)
> >     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >     at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >     at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> >     at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> >     at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> >     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> > 2016-01-17 19:32:17,081 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
> > java.lang.Exception: shutdown Follower
> >     at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> >     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790)
> > 
> > This was followed by closing client connections. I have analysed garbage collection
> > behaviour on these five nodes and found that on that day two of the nodes each underwent
> > one GC collection of 80 ms. What does the read timeout indicate? Does this mean there was
> > a network issue, or was there maybe too much data to sync in a timely manner? Why would
> > all nodes shut down at pretty much the same time?
> > 
> > All the best,
> > -- Bruno