zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Round, Mark" <Mark.Ro...@sky.uk>
Subject Tracking down possible network partition
Date Thu, 25 Jun 2015 14:28:43 GMT
I have a 5-node Zookeeper 3.4.6 cluster across 3 data centres (2 zookeepers in each “main”
DC, and a 5th in a 3rd DC for quorum). I see that the two nodes in one DC have regular “issues”
where they get kicked out of the cluster and the ZooKeeperServer process stops for a few minutes
until the node rejoins. I’d like to know a couple of things, if someone could please point
me in the direction of the relevant docs I’d greatly appreciate it.

1.) Is it expected behaviour that when a node is kicked from the cluster, it will not be allowed
to re-join for a period ? From the logs below I can see that re-establishing a valid cluster
took around 15 minutes.
2.) It appears that the leader closes connections to the affected followers after a “transaction
timeout” occurs. Where would I find out what this timeout is ? Is this the same thing as
a session timout (e.g. The default of 20 * tickTime) ?
3.) Where can I find the definition of the different fields in the election log messages (I.e.
What are “n.round”, “n.zxid”, “n.state” and so on) ? I can make an educated guess,
which seems to point towards a possible network partition event (see below). This should help
me start to narrow down the issue. For reference, below are some extracts from the relevant
logs:

On the leader, I see a message similar to the following :

2015-06-24 04:18:40,152 [myid:5] - WARN  [QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:LearnerHandler@687]
- Closing connection to peer due to transaction timeout.
2015-06-24 04:18:40,152 [myid:5] - WARN  [LearnerHandler-/10.93.167.74:36874:LearnerHandler@646]
- ******* GOODBYE /10.93.167.74:36874 ********
2015-06-24 04:18:40,153 [myid:5] - WARN  [LearnerHandler-/10.93.167.74:36874:LearnerHandler@658]
- Ignoring unexpected exception
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
        at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
        at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)
        at org.apache.zookeeper.server.quorum.LearnerHandler.shutdown(LearnerHandler.java:656)
        at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:649)

And on the affected follower :

2015-06-24 04:18:40,160 [myid:3] - WARN  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@89]
- Exception when following the leader
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
        at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
2015-06-24 04:18:40,160 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@166]
- shutdown called
java.lang.Exception: shutdown Follower
        at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790)

Following this, the follower has a lot of the following :

2015-06-24 04:26:24,615 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:42079
2015-06-24 04:26:24,616 [myid:3] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running


And a few election messages. These are interesting as (from a guess of what the fields are
in these messages), I see what looks like a network partition issue :

2015-06-24 04:26:27,689 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 3 (n.leader), 0x10009b227 (n.zxid), 0x2 (n.round),
LOOKING (n.state), 3 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:26:27,691 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING
(n.state), 4 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)


Assuming that “n.leader” means what this zookeeper thinks the leader is, it looks like
there are two leaders – 3 (the affected follower) and 5 (the actual leader that the rest
of the cluster recognises) . And a little while later on the follower, I see a successful
election. It looks like there’s also some shaking out of quorum here – almost as if it’s
trying to recover from a near split-brain scenario (of course, it’s possible my interpretation
of these logs is way off…):


2015-06-24 04:34:27,707 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849]
- Notification time out: 60000
2015-06-24 04:34:27,712 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 3 (n.leader), 0x10009b227 (n.zxid), 0x2 (n.round),
LOOKING (n.state), 3 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,714 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING
(n.state), 4 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,719 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING
(n.state), 1 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,719 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING
(n.state), 1 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,721 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING
(n.state), 2 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,722 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING
(n.state), 2 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,728 [myid:3] - INFO  [WorkerSender[myid=3]:QuorumCnxManager@193] - Have
smaller server identifier, so dropping the connection: (5, 3)
2015-06-24 04:34:27,762 [myid:3] - INFO  [nftserentzk01.slu.skycdc.com/10.93.167.74:3888:QuorumCnxManager$Listener@511]
- Received connection request /10.105.6.117:38358
2015-06-24 04:34:27,764 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@597] -
Notification: 1 (message format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LEADING
(n.state), 5 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,765 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@784]
- FOLLOWING
2015-06-24 04:34:27,765 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162]
- Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir
/zkdata/version-2 snapdir /zkdata/version-2
2015-06-24 04:34:27,765 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@63]
- FOLLOWING - LEADER ELECTION TOOK - 943889


Are my assumptions correct here, or am I barking up the wrong tree ?

Many thanks in advance,

Mark Dastmalchi-Round
DevOps Engineer,  Sky Search
BSkyB | Great West House | GW1 - 4th floor

Information in this email including any attachments may be privileged, confidential and is
intended exclusively for the addressee. The views expressed may not be official policy, but
the personal views of the originator. If you have received it in error, please notify the
sender by return e-mail and delete it from your system. You should not reproduce, distribute,
store, retransmit, use or disclose its contents to anyone. Please note we reserve the right
to monitor all e-mail communication through our internal and external networks. SKY and the
SKY marks are trademarks of Sky plc and Sky International AG and are used under licence. Sky
UK Limited (Registration No. 2906991), Sky-In-Home Service Limited (Registration No. 2067075)
and Sky Subscribers Services Limited (Registration No. 2340150) are direct or indirect subsidiaries
of Sky plc (Registration No. 2247735). All of the companies mentioned in this paragraph are
incorporated in England and Wales and share the same registered office at Grant Way, Isleworth,
Middlesex TW7 5QD.

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message