zookeeper-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sebastian Schmitz (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (ZOOKEEPER-2938) Server is unable to join quorum after connection broken to other peers
Date Wed, 12 Jun 2019 00:52:06 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16861638#comment-16861638 ] 

Sebastian Schmitz edited comment on ZOOKEEPER-2938 at 6/12/19 12:51 AM:
------------------------------------------------------------------------

Hey guys,

I'm running Zookeeper 3.5.5 with Kafka 2.2.1 in my Dev-Environment and got exactly this problem there.

And as stated above, restarting the leader fixed it. So upgrading to 3.5.5 unfortunately doesn't fix the problem....

 Logs from all nodes with restart of node2, which was the leader:
{code:java}
node 1 (not leader)
2019-06-12 00:37:51,904 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 1)
2019-06-12 00:37:51,904 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 3200
2019-06-12 00:37:55,108 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (2, 1)
2019-06-12 00:37:55,113 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 1)
2019-06-12 00:37:55,113 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 6400
2019-06-12 00:38:01,516 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (2, 1)
2019-06-12 00:38:01,521 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 1)
2019-06-12 00:38:01,521 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 12800
2019-06-12 00:38:14,324 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (2, 1)
2019-06-12 00:38:14,327 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 1)
2019-06-12 00:38:14,327 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 25600
2019-06-12 00:38:39,935 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (2, 1)
2019-06-12 00:38:40,126 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 1)
2019-06-12 00:38:40,126 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 51200
2019-06-12 00:39:31,544 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (2, 1)
2019-06-12 00:39:31,786 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (3, 1)
2019-06-12 00:39:31,786 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 60000
2019-06-12 00:39:46,639 [myid:1] - WARN [NIOWorkerThread-1:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:46,639 [myid:1] - WARN [NIOWorkerThread-2:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:47,259 [myid:1] - WARN [NIOWorkerThread-3:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:48,756 [myid:1] - WARN [NIOWorkerThread-4:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:50,105 [myid:1] - WARN [NIOWorkerThread-1:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:50,620 [myid:1] - WARN [NIOWorkerThread-2:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:51,638 [myid:1] - WARN [NIOWorkerThread-3:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:53,116 [myid:1] - WARN [NIOWorkerThread-4:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:53,203 [myid:1] - WARN [NIOWorkerThread-1:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:54,416 [myid:1] - WARN [NIOWorkerThread-2:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:55,788 [myid:1] - WARN [NIOWorkerThread-3:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:55,814 [myid:1] - WARN [NIOWorkerThread-4:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:57,412 [myid:1] - WARN [NIOWorkerThread-1:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:58,224 [myid:1] - WARN [NIOWorkerThread-2:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:39:58,687 [myid:1] - WARN [NIOWorkerThread-3:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:00,237 [myid:1] - WARN [NIOWorkerThread-4:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:00,654 [myid:1] - WARN [NIOWorkerThread-1:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:02,051 [myid:1] - WARN [NIOWorkerThread-2:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:03,220 [myid:1] - WARN [NIOWorkerThread-3:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:03,262 [myid:1] - WARN [NIOWorkerThread-4:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:04,923 [myid:1] - WARN [NIOWorkerThread-1:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:05,506 [myid:1] - WARN [NIOWorkerThread-2:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:07,148 [myid:1] - WARN [NIOWorkerThread-3:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:07,497 [myid:1] - WARN [NIOWorkerThread-4:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:09,029 [myid:1] - INFO [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /x.x.x.x:47290
2019-06-12 00:40:09,449 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,449 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,452 [myid:1] - INFO [WorkerSender[myid=1]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (2, 1)
2019-06-12 00:40:09,457 [myid:1] - INFO [WorkerSender[myid=1]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (2, 1)
2019-06-12 00:40:09,501 [myid:1] - WARN [NIOWorkerThread-1:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:09,676 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,677 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 1 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,677 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,679 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 1 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,841 [myid:1] - WARN [NIOWorkerThread-2:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:09,880 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1269] - FOLLOWING
2019-06-12 00:40:09,897 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Learner@91] - TCP NoDelay set to: true
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:zookeeper.version=3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on 05/03/2019 12:07 GMT
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:host.name=zookeeper_node_1
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.version=1.8.0_212
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.vendor=Oracle Corporation
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.home=/usr/local/openjdk-8
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.class.path=/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/target/classes:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../build/classes:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/target/lib/*.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../build/lib/*.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/zookeeper-jute-3.5.5.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/zookeeper-3.5.5.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/slf4j-log4j12-1.7.25.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/slf4j-api-1.7.25.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/netty-all-4.1.29.Final.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/log4j-1.2.17.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/json-simple-1.1.1.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jline-2.11.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-util-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-servlet-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-server-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-security-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-io-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-http-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/javax.servlet-api-3.1.0.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-databind-2.9.8.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-core-2.9.8.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-annotations-2.9.0.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/commons-cli-1.2.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/audience-annotations-0.5.0.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../zookeeper-*.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/src/main/resources/lib/*.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../conf::/opt/zookeeper-cluster/newrelic/newrelic.jar
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.io.tmpdir=/tmp
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.compiler=<NA>
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.name=Linux
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.arch=amd64
2019-06-12 00:40:09,919 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.version=3.10.0-693.21.1.el7.x86_64
2019-06-12 00:40:09,920 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:user.name=root
2019-06-12 00:40:09,920 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:user.home=/root
2019-06-12 00:40:09,920 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:user.dir=/
2019-06-12 00:40:09,920 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.memory.free=290MB
2019-06-12 00:40:09,920 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.memory.max=889MB
2019-06-12 00:40:09,920 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.memory.total=396MB
2019-06-12 00:40:09,923 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@938] - minSessionTimeout set to 4000
2019-06-12 00:40:09,923 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@947] - maxSessionTimeout set to 40000
2019-06-12 00:40:09,923 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@166] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /mnt/zk_data_log/version-2 snapdir /mnt/zk_data/version-2
2019-06-12 00:40:09,924 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Follower@69] - FOLLOWING - LEADER ELECTION TOOK - 43 MS
2019-06-12 00:40:10,184 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Learner@391] - Getting a diff from the leader 0x2d000022ce
2019-06-12 00:40:10,207 [myid:1] - WARN [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Learner@454] - Got zxid 0x2d00000001 expected 0x1
2019-06-12 00:40:10,321 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Learner@546] - Learner received NEWLEADER message
2019-06-12 00:40:10,372 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):Learner@529] - Learner received UPTODATE message
2019-06-12 00:40:10,400 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled):CommitProcessor@256] - Configuring CommitProcessor with 2 worker threads.
2019-06-12 00:40:10,460 [myid:1] - INFO [SyncThread:1:FileTxnLog@216] - Creating new log file: log.2d00000001


node 2 (not leader)
2019-06-12 00:40:09,456 [myid:2] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 1, my id = 2, error =
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)
2019-06-12 00:40:09,456 [myid:2] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker@1179] - Interrupting SendWorker
2019-06-12 00:40:09,457 [myid:2] - WARN [RecvWorker:2:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 2, my id = 2, error =
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)
2019-06-12 00:40:09,457 [myid:2] - WARN [RecvWorker:2:QuorumCnxManager$RecvWorker@1179] - Interrupting SendWorker
2019-06-12 00:40:09,460 [myid:2] - WARN [SendWorker:2:QuorumCnxManager$SendWorker@1092] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
2019-06-12 00:40:09,460 [myid:2] - INFO [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /x.x.x.x:59918
2019-06-12 00:40:09,461 [myid:2] - WARN [SendWorker:2:QuorumCnxManager$SendWorker@1102] - Send worker leaving thread id 2 my id = 2
2019-06-12 00:40:09,461 [myid:2] - INFO [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /x.x.x.x:44210
2019-06-12 00:40:09,462 [myid:2] - WARN [SendWorker:1:QuorumCnxManager$SendWorker@1092] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
2019-06-12 00:40:09,463 [myid:2] - WARN [SendWorker:1:QuorumCnxManager$SendWorker@1102] - Send worker leaving thread id 1 my id = 2
2019-06-12 00:40:09,464 [myid:2] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 1, my id = 2, error =
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)
2019-06-12 00:40:09,464 [myid:2] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker@1179] - Interrupting SendWorker
2019-06-12 00:40:09,464 [myid:2] - INFO [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /x.x.x.x:59920
2019-06-12 00:40:09,464 [myid:2] - WARN [SendWorker:2:QuorumCnxManager$SendWorker@1092] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
2019-06-12 00:40:09,465 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,465 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,467 [myid:2] - WARN [RecvWorker:2:QuorumCnxManager$RecvWorker@1179] - Interrupting SendWorker
2019-06-12 00:40:09,467 [myid:2] - WARN [SendWorker:2:QuorumCnxManager$SendWorker@1102] - Send worker leaving thread id 2 my id = 2
2019-06-12 00:40:09,588 [myid:2] - WARN [NIOWorkerThread-1:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:09,691 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 2 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,691 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 2 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,691 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 2 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,691 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 2 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,892 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1269] - FOLLOWING
2019-06-12 00:40:09,892 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@938] - minSessionTimeout set to 4000
2019-06-12 00:40:09,892 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@947] - maxSessionTimeout set to 40000
2019-06-12 00:40:09,892 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@166] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /mnt/zk_data_log/version-2 snapdir /mnt/zk_data/version-2
2019-06-12 00:40:09,892 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):Follower@69] - FOLLOWING - LEADER ELECTION TOOK - 1 MS
2019-06-12 00:40:09,950 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):Learner@391] - Getting a diff from the leader 0x2d000022ce
2019-06-12 00:40:09,951 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):Learner@546] - Learner received NEWLEADER message
2019-06-12 00:40:10,080 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):Learner@529] - Learner received UPTODATE message
2019-06-12 00:40:10,081 [myid:2] - INFO [QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled):CommitProcessor@256] - Configuring CommitProcessor with 2 worker threads.
2019-06-12 00:40:11,310 [myid:2] - INFO [NIOWorkerThread-2:Learner@113] - Revalidating client: 0x305c08162210001


Node 3 (leader)
2019-06-12 00:38:40,130 [myid:3] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 1, my id = 3, error =
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)
2019-06-12 00:38:40,130 [myid:3] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker@1179] - Interrupting SendWorker
2019-06-12 00:38:40,130 [myid:3] - WARN [RecvWorker:3:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 3, my id = 3, error =
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)
2019-06-12 00:38:40,130 [myid:3] - INFO [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /x.x.x.x:33338
2019-06-12 00:38:40,131 [myid:3] - WARN [RecvWorker:3:QuorumCnxManager$RecvWorker@1179] - Interrupting SendWorker
2019-06-12 00:38:40,131 [myid:3] - WARN [SendWorker:3:QuorumCnxManager$SendWorker@1092] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
2019-06-12 00:38:40,136 [myid:3] - WARN [SendWorker:3:QuorumCnxManager$SendWorker@1102] - Send worker leaving thread id 3 my id = 3
2019-06-12 00:38:40,146 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2c0000232e (n.zxid), 0x2 (n.round), LOOKING (n.state), 3 (n.sid), 0x2c (n.peerEPoch), LEADING (my state)0 (n.config version)
2019-06-12 00:38:40,146 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2c0000232e (n.zxid), 0x2 (n.round), LEADING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LEADING (my state)0 (n.config version)
2019-06-12 00:39:31,783 [myid:3] - INFO [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /x.x.x.x:42022
2019-06-12 00:39:31,789 [myid:3] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 1, my id = 3, error =
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)
2019-06-12 00:39:31,789 [myid:3] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker@1179] - Interrupting SendWorker
2019-06-12 00:39:31,789 [myid:3] - WARN [SendWorker:1:QuorumCnxManager$SendWorker@1092] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
2019-06-12 00:39:31,789 [myid:3] - WARN [SendWorker:1:QuorumCnxManager$SendWorker@1102] - Send worker leaving thread id 1 my id = 3
2019-06-12 00:39:31,789 [myid:3] - WARN [RecvWorker:3:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 3, my id = 3, error =
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)
2019-06-12 00:39:31,791 [myid:3] - WARN [RecvWorker:3:QuorumCnxManager$RecvWorker@1179] - Interrupting SendWorker
2019-06-12 00:39:31,790 [myid:3] - INFO [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /x.x.x.x:33348
2019-06-12 00:39:31,792 [myid:3] - WARN [SendWorker:3:QuorumCnxManager$SendWorker@1092] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
2019-06-12 00:39:31,794 [myid:3] - WARN [SendWorker:3:QuorumCnxManager$SendWorker@1102] - Send worker leaving thread id 3 my id = 3
2019-06-12 00:39:31,984 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2c0000232e (n.zxid), 0x2 (n.round), LOOKING (n.state), 3 (n.sid), 0x2c (n.peerEPoch), LEADING (my state)0 (n.config version)
2019-06-12 00:39:31,984 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2c0000232e (n.zxid), 0x2 (n.round), LEADING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LEADING (my state)0 (n.config version)
setting dataDir=/mnt/zk_data
setting dataLogDir=/mnt/zk_data_log
setting server.1=node1:2888:3888
setting server.2=node2:2888:3888
setting server.3=0.0.0.0:2888:3888
echo 3 > /mnt/zk_data/myid
setting ZOO_LOG4J_PROP=INFO,CONSOLE
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../conf/zoo.cfg
Jun 12, 2019 00:39:54 +0000 [1 1] com.newrelic INFO: New Relic Agent: Loading configuration file "/opt/zookeeper-cluster/newrelic/./newrelic.yml"
Jun 12, 2019 00:39:54 +0000 [1 1] com.newrelic INFO: Using default collector host: collector.newrelic.com
Jun 12, 2019 00:39:55 +0000 [1 1] com.newrelic INFO: New Relic Agent: Writing to log file: /opt/zookeeper-cluster/newrelic/logs/newrelic_agent.log
2019-06-12 00:40:06,610 [myid:] - INFO [main:QuorumPeerConfig@133] - Reading configuration from: /opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../conf/zoo.cfg
2019-06-12 00:40:06,627 [myid:] - INFO [main:QuorumPeerConfig@385] - clientPortAddress is 0.0.0.0/0.0.0.0:2181
2019-06-12 00:40:06,627 [myid:] - INFO [main:QuorumPeerConfig@389] - secureClientPort is not set
2019-06-12 00:40:06,879 [myid:3] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2019-06-12 00:40:06,880 [myid:3] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2019-06-12 00:40:06,892 [myid:3] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2019-06-12 00:40:06,893 [myid:3] - INFO [main:ManagedUtil@46] - Log4j found with jmx enabled.
2019-06-12 00:40:06,967 [myid:3] - INFO [main:QuorumPeerMain@141] - Starting quorum peer
2019-06-12 00:40:07,020 [myid:3] - INFO [main:ServerCnxnFactory@135] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2019-06-12 00:40:07,041 [myid:3] - INFO [main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s sessionless connection timeout, 1 selector thread(s), 4 worker threads, and 64 kB direct buffers.
2019-06-12 00:40:07,060 [myid:3] - INFO [main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:2181
2019-06-12 00:40:07,407 [myid:3] - INFO [main:Log@193] - Logging initialized @14548ms to org.eclipse.jetty.util.log.Slf4jLog
2019-06-12 00:40:07,864 [myid:3] - WARN [main:ContextHandler@1588] - o.e.j.s.ServletContextHandler@360bc645{/,null,UNAVAILABLE} contextPath ends with /*
2019-06-12 00:40:07,864 [myid:3] - WARN [main:ContextHandler@1599] - Empty contextPath
2019-06-12 00:40:08,096 [myid:3] - INFO [main:X509Util@79] - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
2019-06-12 00:40:08,117 [myid:3] - INFO [main:QuorumPeer@1488] - Local sessions disabled
2019-06-12 00:40:08,119 [myid:3] - INFO [main:QuorumPeer@1499] - Local session upgrading disabled
2019-06-12 00:40:08,119 [myid:3] - INFO [main:QuorumPeer@1466] - tickTime set to 2000
2019-06-12 00:40:08,119 [myid:3] - INFO [main:QuorumPeer@1510] - minSessionTimeout set to 4000
2019-06-12 00:40:08,119 [myid:3] - INFO [main:QuorumPeer@1521] - maxSessionTimeout set to 40000
2019-06-12 00:40:08,119 [myid:3] - INFO [main:QuorumPeer@1536] - initLimit set to 10
2019-06-12 00:40:08,140 [myid:3] - INFO [main:ZKDatabase@117] - zookeeper.snapshotSizeFactor = 0.33
2019-06-12 00:40:08,158 [myid:3] - INFO [main:QuorumPeer@1781] - Using insecure (non-TLS) quorum communication
2019-06-12 00:40:08,158 [myid:3] - INFO [main:QuorumPeer@1787] - Port unification disabled
2019-06-12 00:40:08,158 [myid:3] - INFO [main:QuorumPeer@2154] - QuorumPeer communication is not secured! (SASL auth disabled)
2019-06-12 00:40:08,158 [myid:3] - INFO [main:QuorumPeer@2183] - quorum.cnxn.threads.size set to 20
2019-06-12 00:40:08,165 [myid:3] - INFO [main:FileSnap@83] - Reading snapshot /mnt/zk_data/version-2/snapshot.2c0000232e
2019-06-12 00:40:08,602 [myid:3] - WARN [NIOWorkerThread-1:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:08,613 [myid:3] - INFO [main:Server@370] - jetty-9.4.17.v20190418; built: 2019-04-18T19:45:35.259Z; git: aa1c656c315c011c01e7b21aabb04066635b9f67; jvm 1.8.0_212-b04
2019-06-12 00:40:08,618 [myid:3] - WARN [NIOWorkerThread-2:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-06-12 00:40:08,746 [myid:3] - INFO [main:DefaultSessionIdManager@365] - DefaultSessionIdManager workerName=node0
2019-06-12 00:40:08,746 [myid:3] - INFO [main:DefaultSessionIdManager@370] - No SessionScavenger set, using defaults
2019-06-12 00:40:08,754 [myid:3] - INFO [main:HouseKeeper@149] - node0 Scavenging every 660000ms
2019-06-12 00:40:08,907 [myid:3] - INFO [main:ContextHandler@855] - Started o.e.j.s.ServletContextHandler@360bc645{/,null,AVAILABLE}
2019-06-12 00:40:08,936 [myid:3] - INFO [main:AbstractConnector@292] - Started ServerConnector@7afb1741{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2019-06-12 00:40:08,936 [myid:3] - INFO [main:Server@410] - Started @16078ms
2019-06-12 00:40:08,936 [myid:3] - INFO [main:JettyAdminServer@112] - Started AdminServer on address 0.0.0.0, port 8080 and command URL /commands
2019-06-12 00:40:08,952 [myid:3] - INFO [QuorumPeerListener:QuorumCnxManager$Listener@881] - My election bind port: /0.0.0.0:3888
2019-06-12 00:40:09,011 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1193] - LOOKING
2019-06-12 00:40:09,014 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@885] - New election. My id = 3, proposed zxid=0x2d000022ce
2019-06-12 00:40:09,046 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,276 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 2 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,277 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 1 (n.leader), 0x2c0000232e (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x2c (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,277 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@919] - Notification time out: 400
2019-06-12 00:40:09,281 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,281 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,282 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 3 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,458 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,463 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 1 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,470 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x2d000022ce (n.zxid), 0x3 (n.round), LOOKING (n.state), 2 (n.sid), 0x2d (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-06-12 00:40:09,671 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1281] - LEADING
2019-06-12 00:40:09,676 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Leader@66] - TCP NoDelay set to: true
2019-06-12 00:40:09,676 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Leader@86] - zookeeper.leader.maxConcurrentSnapshots = 10
2019-06-12 00:40:09,676 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Leader@88] - zookeeper.leader.maxConcurrentSnapshotTimeout = 5
2019-06-12 00:40:09,712 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:zookeeper.version=3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on 05/03/2019 12:07 GMT
2019-06-12 00:40:09,712 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:host.name=zookeeper_node_3
2019-06-12 00:40:09,712 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.version=1.8.0_212
2019-06-12 00:40:09,713 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.vendor=Oracle Corporation
2019-06-12 00:40:09,713 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.home=/usr/local/openjdk-8
2019-06-12 00:40:09,713 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.class.path=/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/target/classes:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../build/classes:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/target/lib/*.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../build/lib/*.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/zookeeper-jute-3.5.5.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/zookeeper-3.5.5.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/slf4j-log4j12-1.7.25.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/slf4j-api-1.7.25.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/netty-all-4.1.29.Final.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/log4j-1.2.17.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/json-simple-1.1.1.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jline-2.11.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-util-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-servlet-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-server-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-security-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-io-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-http-9.4.17.v20190418.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/javax.servlet-api-3.1.0.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-databind-2.9.8.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-core-2.9.8.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-annotations-2.9.0.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/commons-cli-1.2.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../lib/audience-annotations-0.5.0.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../zookeeper-*.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/src/main/resources/lib/*.jar:/opt/zookeeper-cluster/apache-zookeeper-3.5.5-bin/bin/../conf::/opt/zookeeper-cluster/newrelic/newrelic.jar
2019-06-12 00:40:09,713 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-06-12 00:40:09,713 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.io.tmpdir=/tmp
2019-06-12 00:40:09,713 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:java.compiler=<NA>
2019-06-12 00:40:09,713 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.name=Linux
2019-06-12 00:40:09,713 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.arch=amd64
2019-06-12 00:40:09,713 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.version=3.10.0-693.21.1.el7.x86_64
2019-06-12 00:40:09,722 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:user.name=root
2019-06-12 00:40:09,723 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:user.home=/root
2019-06-12 00:40:09,723 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:user.dir=/
2019-06-12 00:40:09,723 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.memory.free=106MB
2019-06-12 00:40:09,723 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.memory.max=889MB
2019-06-12 00:40:09,723 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Environment@109] - Server environment:os.memory.total=253MB
2019-06-12 00:40:09,734 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@938] - minSessionTimeout set to 4000
2019-06-12 00:40:09,735 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@947] - maxSessionTimeout set to 40000
2019-06-12 00:40:09,736 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@166] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /mnt/zk_data_log/version-2 snapdir /mnt/zk_data/version-2
2019-06-12 00:40:09,743 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Leader@464] - LEADING - LEADER ELECTION TOOK - 72 MS
2019-06-12 00:40:09,750 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):FileTxnSnapLog@372] - Snapshotting: 0x2d000022ce to /mnt/zk_data/version-2/snapshot.2d000022ce
2019-06-12 00:40:09,904 [myid:3] - INFO [LearnerHandler-/x.x.x.x:42794:LearnerHandler@406] - Follower sid: 2 : info : node2:2888:3888:participant
2019-06-12 00:40:09,943 [myid:3] - INFO [LearnerHandler-/x.x.x.x:42794:ZKDatabase@295] - On disk txn sync enabled with snapshotSizeFactor 0.33
2019-06-12 00:40:09,944 [myid:3] - INFO [LearnerHandler-/x.x.x.x:42794:LearnerHandler@708] - Synchronizing with Follower sid: 2 maxCommittedLog=0x2d000022ce minCommittedLog=0x2d000020da lastProcessedZxid=0x2d000022ce peerLastZxid=0x2d000022ce
2019-06-12 00:40:09,944 [myid:3] - INFO [LearnerHandler-/x.x.x.x:42794:LearnerHandler@752] - Sending DIFF zxid=0x2d000022ce for peer sid: 2
2019-06-12 00:40:09,955 [myid:3] - INFO [LearnerHandler-/x.x.x.x:48552:LearnerHandler@406] - Follower sid: 1 : info : node1:2888:3888:participant
2019-06-12 00:40:09,967 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Leader@1296] - Have quorum of supporters, sids: [ [2, 3],[2, 3] ]; starting up and setting last processed zxid: 0x2e00000000
2019-06-12 00:40:09,989 [myid:3] - INFO [LearnerHandler-/x.x.x.x:48552:ZKDatabase@295] - On disk txn sync enabled with snapshotSizeFactor 0.33
2019-06-12 00:40:09,989 [myid:3] - INFO [LearnerHandler-/x.x.x.x:48552:LearnerHandler@708] - Synchronizing with Follower sid: 1 maxCommittedLog=0x2d000022ce minCommittedLog=0x2d000020da lastProcessedZxid=0x2d000022ce peerLastZxid=0x2c0000232e
2019-06-12 00:40:09,992 [myid:3] - INFO [LearnerHandler-/x.x.x.x:48552:LearnerHandler@784] - Use txnlog and committedLog for peer sid: 1
2019-06-12 00:40:09,994 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):CommitProcessor@256] - Configuring CommitProcessor with 2 worker threads.
2019-06-12 00:40:10,072 [myid:3] - INFO [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):ContainerManager@64] - Using checkIntervalMs=60000 maxPerMinute=10000
2019-06-12 00:40:10,077 [myid:3] - INFO [LearnerHandler-/x.x.x.x:48552:LearnerHandler@859] - Sending DIFF zxid=0x2d000022ce for peer sid: 1

{code}


was (Author: paxi):
Hey guys,

I'm running Zookeeper 3.5.5 with Kafka 2.2.1 in my Dev-Environment and got exactly this problem there.

And as stated above, restarting the leader fixed it. So upgrading to 3.5.5 unfortunately doesn't fix the problem....

 

> Server is unable to join quorum after connection broken to other peers
> ----------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2938
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2938
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.6
>            Reporter: Abhay Bothra
>            Priority: Major
>
> We see the following logs in the node with {{myid: 1}}
> {code}
> 2017-11-08 15:06:28,375 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (2, 1)
> 2017-11-08 15:06:28,375 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (3, 1)
> 2017-11-08 15:07:28,375 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x28e000a8750 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x28e (n.peerEpoch) LOOKING (my state)
> 2017-11-08 15:07:28,375 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (2, 1)
> 2017-11-08 15:07:28,376 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (3, 1)
> 2017-11-08 15:08:28,375 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x28e000a8750 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x28e (n.peerEpoch) LOOKING (my state)
> 2017-11-08 15:08:28,376 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (2, 1)
> 2017-11-08 15:08:28,376 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (3, 1)
> 2017-11-08 15:09:28,376 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x28e000a8750 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x28e (n.peerEpoch) LOOKING (my state)
> 2017-11-08 15:09:28,376 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (2, 1)
> 2017-11-08 15:09:28,376 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (3, 1)
> 2017-11-08 15:10:28,376 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x28e000a8750 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x28e (n.peerEpoch) LOOKING (my state)
> 2017-11-08 15:10:28,376 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (2, 1)
> 2017-11-08 15:10:28,377 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@193] - Have smaller server identifier, so dropping the connection: (3, 1)
> {code}
> On the nodes with {{myid: 2}} and {{myid: 3}}, we see connection broken events for {{myid: 1}}
> {code}
> 2017-11-07 02:54:32,135 [myid:2] - WARN  [RecvWorker:1:QuorumCnxManager$RecvWorker@780] - Connection broken for id 1, my id = 2, error =
> java.net.SocketException: Connection reset
>         at java.net.SocketInputStream.read(SocketInputStream.java:209)
>         at java.net.SocketInputStream.read(SocketInputStream.java:141)
>         at java.net.SocketInputStream.read(SocketInputStream.java:223)
>         at java.io.DataInputStream.readInt(DataInputStream.java:387)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:765)
> 2017-11-07 02:54:32,135 [myid:2] - WARN  [RecvWorker:1:QuorumCnxManager$RecvWorker@783] - Interrupting SendWorker
> 2017-11-07 02:54:32,135 [myid:2] - WARN  [SendWorker:1:QuorumCnxManager$SendWorker@697] - Interrupted while waiting for message on queue
> java.lang.InterruptedException
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
>         at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:849)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:64)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:685)
> 2017-11-07 02:54:32,135 [myid:2] - WARN  [SendWorker:1:QuorumCnxManager$SendWorker@706] - Send worker leaving thread
> {code}
> From the reported occurrences, it looks like this is a problem only when the node with the smallest {{myid}} loses connection.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message