zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Akmal Abbasov <akmal.abba...@icloud.com>
Subject Unstable work of zookeeper
Date Thu, 24 Sep 2015 12:24:50 GMT
Hi,
I am using zookeeper 3.4.6
I have a spark cluster configured with HA. Once per 1-2 days, the active spark master is shutting
down with a message
15/09/23 18:58:18 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid
0x34ffa68dbd10021, likely server has closed socket, closing socket connection and attempting
reconnect
15/09/23 18:58:18 INFO state.ConnectionStateManager: State change: SUSPENDED
15/09/23 18:58:18 INFO master.ZooKeeperLeaderElectionAgent: We have lost leadership
15/09/23 18:58:18 ERROR master.Master: Leadership has been revoked -- master shutting down.
15/09/23 18:58:18 INFO util.Utils: Shutdown hook called

I don’t have the zookeeper logs from the same period, but the logs are full of the these
messages 
2015-09-24 05:07:42,228 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /10.0.8.4:34705
2015-09-24 05:07:42,229 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822]
- Connection request from old client /10.0.8.4:34705; will be dropped if server is in r-o
mode
2015-09-24 05:07:42,229 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868]
- Client attempting to establish new session at /10.0.8.4:34705
2015-09-24 05:07:42,292 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@617] - Established
session 0x14ffd3670130030 with negotiated timeout 20001 for client /10.0.8.4:34705
2015-09-24 05:07:42,302 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357]
- caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x14ffd3670130030,
likely client has closed socket
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-09-24 05:07:42,303 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /10.0.8.4:34705 which had sessionid 0x14ffd3670130030
2015-09-24 05:07:42,314 [myid:1] - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected
Exception:
java.nio.channels.CancelledKeyException
	at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
	at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
	at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
	at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
	at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
	at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2015-09-24 05:07:42,334 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /10.0.8.4:34707
2015-09-24 05:07:42,334 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822]
- Connection request from old client /10.0.8.4:34707; will be dropped if server is in r-o
mode
2015-09-24 05:07:42,335 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868]
- Client attempting to establish new session at /10.0.8.4:34707
2015-09-24 05:07:42,357 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@617] - Established
session 0x14ffd3670130031 with negotiated timeout 20001 for client /10.0.8.4:34707
2015-09-24 05:07:42,364 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357]
- caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x14ffd3670130031,
likely client has closed socket
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745)
2015-09-24 05:07:42,365 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /10.0.8.4:34707 which had sessionid 0x14ffd3670130031
2015-09-24 05:07:42,376 [myid:1] - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected
Exception:
java.nio.channels.CancelledKeyException
	at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
	at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
	at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
	at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
	at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
	at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)

Also there are 
2015-09-24 06:29:54,459 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FollowerZooKeeperServer@139]
- Shutting down
2015-09-24 06:29:54,459 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@441]
- shutting down
2015-09-24 06:29:54,459 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FollowerRequestProcessor@105]
- Shutting down
2015-09-24 06:29:54,459 [myid:1] - INFO  [FollowerRequestProcessor:1:FollowerRequestProcessor@95]
- FollowerRequestProcessor exited loop!
2015-09-24 06:29:54,460 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:CommitProcessor@181]
- Shutting down
2015-09-24 06:29:54,464 [myid:1] - INFO  [CommitProcessor:1:CommitProcessor@150] - CommitProcessor
exited loop!
2015-09-24 06:29:54,465 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@415]
- shutdown of request processor complete
2015-09-24 06:29:54,466 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@209]
- Shutting down
2015-09-24 06:29:54,466 [myid:1] - INFO  [SyncThread:1:SyncRequestProcessor@187] - SyncRequestProcessor
exited!
2015-09-24 06:29:54,466 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@714]
- LOOKING
2015-09-24 06:29:54,584 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /10.0.8.58:36137
2015-09-24 06:29:54,584 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-09-24 06:29:54,584 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /10.0.8.58:36137 (no session established for client)
2015-09-24 06:29:54,679 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197]
- Accepted socket connection from /10.0.8.57:57410
2015-09-24 06:29:54,680 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362]
- Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-09-24 06:29:54,680 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /10.0.8.57:57410 (no session established for client)

I also observed that hadoop-zkfc restarts very frequently.
Any ideas what could be wrong?

Thanks.
Mime
View raw message