zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Filip Deleersnijder <fi...@motum.be>
Subject Re: Leader election problems
Date Thu, 25 Jun 2015 15:12:16 GMT
Hi,

Some more information that we just got from our installation.

- To check the EOF theory, we cleared all the version-2 data directories and restarted all
vehicles
- 2 of the 8 vehicles were unavailable at that time
- The remaining 6 vehicles did not succeed in finishing leader election phase !

A log extract can be found at the end of this e-mail.

- We then removed the 2 vehicles from the configuration just for testing purposes
- Now leader election succeed without any problems !

Any insights are very helpful !

Filip

Logging during failing leader election phase : 

2015-06-25 16:29:30,965 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65133 (no session established for client)
2015-06-25 16:29:31,171 [myid:8] - WARN  [WorkerSender[myid=8]:QuorumCnxManager@382] - Cannot
open channel to 5 at election address /172.17.35.15:3888
java.net.SocketTimeoutException: connect timed out
	at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
	at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:31,189 [myid:8] - DEBUG [WorkerSender[myid=8]:QuorumCnxManager@364] - Opening
channel to server 6
2015-06-25 16:29:31,464 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65145
2015-06-25 16:29:31,466 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:31,467 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:31,467 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65145 (no session established for client)
2015-06-25 16:29:32,370 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65163
2015-06-25 16:29:32,372 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:32,374 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:32,374 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65163 (no session established for client)
2015-06-25 16:29:32,585 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65167
2015-06-25 16:29:32,587 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:32,588 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:32,589 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65167 (no session established for client)
2015-06-25 16:29:32,800 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65173
2015-06-25 16:29:32,805 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:32,809 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:32,810 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65173 (no session established for client)
2015-06-25 16:29:33,543 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65190
2015-06-25 16:29:33,544 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:33,546 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:33,546 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65190 (no session established for client)
2015-06-25 16:29:33,983 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65200
2015-06-25 16:29:33,985 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:33,986 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:33,986 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65200 (no session established for client)
2015-06-25 16:29:34,527 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65212
2015-06-25 16:29:34,529 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:34,530 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:34,530 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65212 (no session established for client)
2015-06-25 16:29:35,108 [myid:8] - DEBUG [WorkerReceiver[myid=8]:FastLeaderElection$Messenger$WorkerReceiver@270]
- Receive new notification message. My id = 8
2015-06-25 16:29:35,109 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@597] -
Notification: 1 (message format version), 2 (n.leader), 0x1ae0000019c (n.zxid), 0x2e (n.round),
LOOKING (n.state), 4 (n.sid), 0x1ae (n.peerEpoch) LOOKING (my state)
2015-06-25 16:29:35,111 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@609]
- id: 2, proposed id: 8, zxid: 0x1ae0000019c, proposed zxid: 0x1ae0000019c
2015-06-25 16:29:35,113 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@885]
- Adding vote: from=4, proposed leader=2, proposed zxid=0x1ae0000019c, proposed election epoch=0x2e
2015-06-25 16:29:35,113 [myid:8] - DEBUG [WorkerReceiver[myid=8]:FastLeaderElection$Messenger$WorkerReceiver@270]
- Receive new notification message. My id = 8
2015-06-25 16:29:35,113 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@597] -
Notification: 1 (message format version), 8 (n.leader), 0x1ae0000019c (n.zxid), 0x2e (n.round),
LOOKING (n.state), 2 (n.sid), 0x1ae (n.peerEpoch) LOOKING (my state)
2015-06-25 16:29:35,115 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@609]
- id: 8, proposed id: 8, zxid: 0x1ae0000019c, proposed zxid: 0x1ae0000019c
2015-06-25 16:29:35,115 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@885]
- Adding vote: from=2, proposed leader=8, proposed zxid=0x1ae0000019c, proposed election epoch=0x2e
2015-06-25 16:29:35,333 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65229
2015-06-25 16:29:35,335 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:35,336 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:35,336 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65229 (no session established for client)
2015-06-25 16:29:35,369 [myid:8] - DEBUG [WorkerReceiver[myid=8]:FastLeaderElection$Messenger$WorkerReceiver@270]
- Receive new notification message. My id = 8
2015-06-25 16:29:35,369 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@597] -
Notification: 1 (message format version), 2 (n.leader), 0x1ae0000019c (n.zxid), 0x2e (n.round),
LOOKING (n.state), 1 (n.sid), 0x1ae (n.peerEpoch) LOOKING (my state)
2015-06-25 16:29:35,371 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@609]
- id: 2, proposed id: 8, zxid: 0x1ae0000019c, proposed zxid: 0x1ae0000019c
2015-06-25 16:29:35,371 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@885]
- Adding vote: from=1, proposed leader=2, proposed zxid=0x1ae0000019c, proposed election epoch=0x2e
2015-06-25 16:29:35,776 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65239
2015-06-25 16:29:35,779 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:35,780 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:35,781 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65239 (no session established for client)
2015-06-25 16:29:36,052 [myid:8] - DEBUG [WorkerReceiver[myid=8]:FastLeaderElection$Messenger$WorkerReceiver@270]
- Receive new notification message. My id = 8
2015-06-25 16:29:36,052 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@597] -
Notification: 1 (message format version), 8 (n.leader), 0x1ae0000019c (n.zxid), 0x2e (n.round),
LOOKING (n.state), 3 (n.sid), 0x1ae (n.peerEpoch) LOOKING (my state)
2015-06-25 16:29:36,054 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@609]
- id: 8, proposed id: 8, zxid: 0x1ae0000019c, proposed zxid: 0x1ae0000019c
2015-06-25 16:29:36,054 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@885]
- Adding vote: from=3, proposed leader=8, proposed zxid=0x1ae0000019c, proposed election epoch=0x2e
2015-06-25 16:29:36,064 [myid:8] - DEBUG [WorkerReceiver[myid=8]:FastLeaderElection$Messenger$WorkerReceiver@270]
- Receive new notification message. My id = 8
2015-06-25 16:29:36,064 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@597] -
Notification: 1 (message format version), 7 (n.leader), 0x1ae00000196 (n.zxid), 0x2e (n.round),
LOOKING (n.state), 7 (n.sid), 0x1ae (n.peerEpoch) LOOKING (my state)
2015-06-25 16:29:36,066 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@609]
- id: 7, proposed id: 8, zxid: 0x1ae00000196, proposed zxid: 0x1ae0000019c
2015-06-25 16:29:36,066 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@885]
- Adding vote: from=7, proposed leader=7, proposed zxid=0x1ae00000196, proposed election epoch=0x2e
2015-06-25 16:29:36,189 [myid:8] - WARN  [WorkerSender[myid=8]:QuorumCnxManager@382] - Cannot
open channel to 6 at election address /172.17.35.16:3888
java.net.SocketTimeoutException: connect timed out
	at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
	at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:36,198 [myid:8] - DEBUG [WorkerSender[myid=8]:QuorumCnxManager@387] - There
is a connection already for server 7
2015-06-25 16:29:36,198 [myid:8] - DEBUG [WorkerSender[myid=8]:QuorumCnxManager@387] - There
is a connection already for server 1
2015-06-25 16:29:36,198 [myid:8] - DEBUG [WorkerSender[myid=8]:QuorumCnxManager@387] - There
is a connection already for server 2
2015-06-25 16:29:36,198 [myid:8] - DEBUG [WorkerSender[myid=8]:QuorumCnxManager@387] - There
is a connection already for server 3
2015-06-25 16:29:36,198 [myid:8] - DEBUG [WorkerSender[myid=8]:QuorumCnxManager@387] - There
is a connection already for server 4
2015-06-25 16:29:36,198 [myid:8] - DEBUG [WorkerSender[myid=8]:QuorumCnxManager@364] - Opening
channel to server 5
2015-06-25 16:29:36,200 [myid:8] - DEBUG [WorkerReceiver[myid=8]:FastLeaderElection$Messenger$WorkerReceiver@270]
- Receive new notification message. My id = 8
2015-06-25 16:29:36,201 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@597] -
Notification: 1 (message format version), 8 (n.leader), 0x1ae0000019c (n.zxid), 0x2e (n.round),
LOOKING (n.state), 8 (n.sid), 0x1ae (n.peerEpoch) LOOKING (my state)
2015-06-25 16:29:36,215 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@609]
- id: 8, proposed id: 8, zxid: 0x1ae0000019c, proposed zxid: 0x1ae0000019c
2015-06-25 16:29:36,215 [myid:8] - DEBUG [QuorumPeer[myid=8]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@885]
- Adding vote: from=8, proposed leader=8, proposed zxid=0x1ae0000019c, proposed election epoch=0x2e
2015-06-25 16:29:36,309 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65251
2015-06-25 16:29:36,315 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:36,344 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:36,345 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65251 (no session established for client)
2015-06-25 16:29:36,369 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65254
2015-06-25 16:29:36,371 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:36,372 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:36,373 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65254 (no session established for client)
2015-06-25 16:29:38,053 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65288
2015-06-25 16:29:38,056 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:38,057 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:38,058 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65288 (no session established for client)
2015-06-25 16:29:38,328 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65294
2015-06-25 16:29:38,330 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:38,331 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:38,332 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65294 (no session established for client)
2015-06-25 16:29:38,679 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65302
2015-06-25 16:29:38,686 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:38,688 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:38,689 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65302 (no session established for client)
2015-06-25 16:29:39,385 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65318
2015-06-25 16:29:39,393 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:39,400 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:39,401 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65318 (no session established for client)
2015-06-25 16:29:39,973 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65331
2015-06-25 16:29:39,975 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:39,977 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:39,977 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65331 (no session established for client)
2015-06-25 16:29:40,510 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65343
2015-06-25 16:29:40,512 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:40,514 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:40,514 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /127.0.0.1:65343 (no session established for client)
2015-06-25 16:29:40,973 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65352
2015-06-25 16:29:40,975 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:40,977 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:40,977 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65352 (no session established for client)
2015-06-25 16:29:41,200 [myid:8] - WARN  [WorkerSender[myid=8]:QuorumCnxManager@382] - Cannot
open channel to 5 at election address /172.17.35.15:3888
java.net.SocketTimeoutException: connect timed out
	at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
	at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:41,211 [myid:8] - DEBUG [WorkerSender[myid=8]:QuorumCnxManager@364] - Opening
channel to server 6
2015-06-25 16:29:41,564 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /0:0:0:0:0:0:0:1:65364
2015-06-25 16:29:41,566 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:41,567 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace
java.io.IOException: ZooKeeperServer not running
	at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:931)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:237)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-06-25 16:29:41,567 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /0:0:0:0:0:0:0:1:65364 (no session established for client)
2015-06-25 16:29:42,448 [myid:8] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /127.0.0.1:65384
2015-06-25 16:29:42,465 [myid:8] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-06-25 16:29:42,467 [myid:8] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@366]
- IOException stack trace

> On 25 Jun 2015, at 16:34, Filip Deleersnijder <filip@motum.be> wrote:
> 
> Hi,
> 
> I can see that all of our logs contain the following log-statements pretty often.
> 
> 2015-06-22 12:02:00,752 [myid:2] - DEBUG [main:DataTree@949][] - Ignoring processTxn
failure hdr: -1 : error: -2
> 2015-06-22 12:02:00,753 [myid:2] - DEBUG [main:DataTree@949][] - Ignoring processTxn
failure hdr: 14 : error: -101
> 
> 2015-06-25 14:02:39,505 [myid:3] - DEBUG [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileTxnLog$FileTxnIterator@636]
- EOF excepton java.io.EOFException: Failed to read c:\motum\config\MASS\ZK\version-2\log.1aa00000001
> Since we don’t properly shut the ZK process down ( we just shutdown windows ), this
properly can cause corruption of files.
> 
> Is there somebody that has a clear idea about whether the “EOF” or the “Ignoring
processTxn” problems could cause frequent and long during Leader Elections ?
> 
> Any help is greatly appreciated,
> 
> Filip
> 
> 
> 
>> On 25 Jun 2015, at 11:51, Guy Moshkowich <guy.moshkowich@gmail.com <mailto:guy.moshkowich@gmail.com>>
wrote:
>> 
>> Are you using ZK client on your vehicles or ZK servers?
>> You mentioned below 8 vehicles and i see 8 servers defined in the config.
>> I would expect you have 8 client(running on your vehicles) communicating
>> against 1 or 3 ZK servers as this will be more than enough for 8 clients.
>> Guy
>> 
>> On Thursday, June 25, 2015, Filip Deleersnijder <filip@motum.be <mailto:filip@motum.be>>
wrote:
>> 
>>> Hi,
>>> 
>>> Thanks for your response.
>>> 
>>> Our application consists of 8 automatic vehicles in a warehouse setting.
>>> Those vehicles need some consensus decisions, and that is what we use
>>> Zookeeper for.
>>> Because vehicles can come and go at random, we installed a ZK participant
>>> on every vehicle. The ZK client is some other piece of software that is
>>> also running on the vehicles.
>>> 
>>> Therefor :
>>>        - We can not choose the number of ZK-participants because it just
>>> depends on the number of vehicles.
>>>        - The participants communicate over Wifi
>>>        - The client is running on the same machine, so it communicates
>>> over the local network
>>> 
>>> We are running Zookeeper version 3.4.6
>>> 
>>> Our zoo.cfg can be found below this e-mail.
>>> 
>>> Thanks in advance !
>>> 
>>> Filip
>>> 
>>> # The number of milliseconds of each tick
>>> tickTime=2000
>>> # The number of ticks that the initial
>>> # synchronization phase can take
>>> initLimit=10
>>> # The number of ticks that can pass between
>>> # sending a request and getting an acknowledgement
>>> syncLimit=5
>>> # the directory where the snapshot is stored.
>>> # do not use /tmp for storage, /tmp here is just
>>> # example sakes.
>>> dataDir=c:/motum/config/MASS/ZK
>>> # the port at which the clients will connect
>>> clientPort=2181
>>> 
>>> server.1=172.17.35.11:2888:3888
>>> server.2=172.17.35.12:2888:3888
>>> server.3=172.17.35.13:2888:3888
>>> server.4=172.17.35.14:2888:3888
>>> server.5=172.17.35.15:2888:3888
>>> server.6=172.17.35.16:2888:3888
>>> server.7=172.17.35.17:2888:3888
>>> server.8=172.17.35.18:2888:3888
>>> 
>>> # The number of snapshots to retain in dataDir
>>> # Purge task interval in hours
>>> # Set to "0" to disable auto purge feature
>>> autopurge.snapRetainCount=3
>>> autopurge.purgeInterval=1
>>> 
>>> 
>>> 
>>>> On 24 Jun 2015, at 18:54, Raúl Gutiérrez Segalés <rgs@itevenworks.net
<mailto:rgs@itevenworks.net>
>>> <javascript:;>> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> On 24 June 2015 at 06:05, Filip Deleersnijder <filip@motum.be <mailto:filip@motum.be>
>>> <javascript:;>> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> Let’s start with some description of our system :
>>>>> 
>>>>> - We our using a Zookeeper cluster with 8 participants for an
>>> application
>>>>> with mobile nodes ( connected over Wifi ).
>>>>> 
>>>> 
>>>> You mean the participants talk over wifi or the clients?
>>>> 
>>>> 
>>>>> ( Ip of the different nodes are according to the following structure
:
>>>>> Node X has IP : 172.17.35.1X )
>>>>> 
>>>> 
>>>> Why 8 and not an odd number of machines (i.e.:
>>>> 
>>> http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#sc_zkMulitServerSetup
<http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#sc_zkMulitServerSetup>
>>>> )?
>>>> 
>>>> - It is not that unusual to have a node being shut-down or restarted
>>>>> - We haven’t benchmarked the number of write operations yet, but I
would
>>>>> estimate that it would be less than 10 writes / second
>>>>> 
>>>> 
>>>> What version of ZK are you using?
>>>> 
>>>> 
>>>>> 
>>>>> The problem we are having however is that sometimes(*), some instances
>>>>> seem to be having problems with leader election.
>>>>> Under the header “Attachment 1” below, you can find the leader election
>>>>> times that were needed over 24h ( from 1 node ).  One average it took
>>> more
>>>>> than 1 minute !
>>>>> I assume that this is not normal behaviour ? ( If somebody could confirm
>>>>> that in a 8-node cluster, these are not normal leader election times,
>>> that
>>>>> would be nice )
>>>>> 
>>>>> In attachement 2 : I included an extract from the logging during a
>>> leader
>>>>> election that took 101874ms for 1 node ( server 2 ).
>>>>> 
>>>>> Any help is greatly appreciated.
>>>>> If further or more specific logging is required, please ask !
>>>>> 
>>>>> 
>>>> Do you mind sharing a copy of your config file (zoo.cfg)? Thanks!
>>>> 
>>>> 
>>>> -rgs
> 


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message