zookeeper-user mailing list archives

From Michi Mutsuzaki <mi...@cs.stanford.edu>
Subject Re: [Zookeeper] Zookeeper Cluster broken due to snapshot corrupted error
Date Mon, 24 Mar 2014 04:05:25 GMT
I wonder if this is related to ZOOKEEPER-1697.

https://issues.apache.org/jira/browse/ZOOKEEPER-1697

--Michi
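
For reading the zxids in the quoted log: ZooKeeper packs the leader epoch into the high 32 bits of a zxid and a per-epoch counter into the low 32 bits. The following is a minimal Python sketch of that split; `split_zxid` is a hypothetical helper name chosen for illustration, not a ZooKeeper API. It decodes two zxid values that appear in the log.

```python
from typing import Tuple


def split_zxid(zxid: int) -> Tuple[int, int]:
    """Split a 64-bit zxid into (epoch, counter).

    Hypothetical helper for illustration only; not part of ZooKeeper.
    """
    epoch = zxid >> 32             # high 32 bits: leader epoch
    counter = zxid & 0xFFFFFFFF    # low 32 bits: counter within the epoch
    return epoch, counter


# zxid values copied from the quoted log
samples = [
    ("snapshot / proposed zxid", 0xC200000001),
    ("last processed zxid set by the leader", 0xC500000000),
]

for label, zxid in samples:
    epoch, counter = split_zxid(zxid)
    print(f"{label}: epoch={epoch:#x} counter={counter:#x}")
```

With these inputs, 0xc200000001 decodes to epoch 0xc2 with counter 1, and 0xc500000000 to epoch 0xc5 with counter 0. The latter epoch matches the n.peerEPoch value 0xc5 reported in the election notifications.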

On Sun, Mar 23, 2014 at 6:15 PM, Jung Young Seok
<jung.youngseok@gmail.com> wrote:
> I've added the ZooKeeper log from 192.168.161.1.
> The timestamps in this log look different, but you can ignore that.
> The log on 192.168.161.1 kept repeating the pattern below.
>
> Thank you for asking.
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 2014-03-19 17:28:06,105 [myid:3] - INFO
> [LearnerHandler-/10.0.33.129:49809:LearnerHandler@395] - Sending DIFF
> 2014-03-19 17:28:07,414 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41252
> 2014-03-19 17:28:07,415 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:07,415 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41252 (no session established for
> client)
> 2014-03-19 17:28:12,173 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41255
> 2014-03-19 17:28:12,174 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:12,174 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41255 (no session established for
> client)
> 2014-03-19 17:28:14,558 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41258
> 2014-03-19 17:28:14,559 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:14,559 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41258 (no session established for
> client)
> 2014-03-19 17:28:18,585 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41261
> 2014-03-19 17:28:18,586 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:18,586 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41261 (no session established for
> client)
> 2014-03-19 17:28:20,067 [myid:3] - WARN
> [LearnerHandler-/10.0.33.1:58546:Leader@574] - Commiting zxid 0xc500000000
> from /10.0.161.1:2888 not first!
> 2014-03-19 17:28:20,067 [myid:3] - WARN
> [LearnerHandler-/10.0.33.1:58546:Leader@576] - First is 0x0
> 2014-03-19 17:28:20,068 [myid:3] - INFO
> [LearnerHandler-/10.0.33.1:58546:Leader@598] - Have quorum of supporters;
> starting up and setting last processed zxid: 0xc500000000
> 2014-03-19 17:28:22,312 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@490] - Shutting down
> 2014-03-19 17:28:22,312 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@496] - Shutdown called
> java.lang.Exception: shutdown Leader! reason: Only 1 followers, need 1
>         at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:496)
>         at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:471)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:753)
> 2014-03-19 17:28:22,313 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@419] - shutting
> down
> 2014-03-19 17:28:22,320 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:SessionTrackerImpl@225] - Shutting
> down
> 2014-03-19 17:28:22,320 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:PrepRequestProcessor@743] -
> Shutting down
> 2014-03-19 17:28:22,321 [myid:3] - INFO  [ProcessThread(sid:3
> cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
> 2014-03-19 17:28:22,321 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ProposalRequestProcessor@88] -
> Shutting down
> 2014-03-19 17:28:22,322 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:CommitProcessor@181] - Shutting
> down
> 2014-03-19 17:28:22,322 [myid:3] - INFO
> [CommitProcessor:3:CommitProcessor@150] - CommitProcessor exited loop!
> 2014-03-19 17:28:22,322 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader$ToBeAppliedRequestProcessor@655]
> - Shutting down
> 2014-03-19 17:28:22,322 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@415] -
> shutdown of request processor complete
> 2014-03-19 17:28:22,323 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@175] -
> Shutting down
> 2014-03-19 17:28:22,323 [myid:3] - INFO
> [SyncThread:3:SyncRequestProcessor@155] - SyncRequestProcessor exited!
> 2014-03-19 17:28:22,325 [myid:3] - WARN
> [LearnerHandler-/10.0.33.1:58546:LearnerHandler@575] - ******* GOODBYE
> /10.0.33.1:58546 ********
> 2014-03-19 17:28:22,326 [myid:3] - WARN
> [LearnerHandler-/10.0.33.129:49809:LearnerHandler@575] - ******* GOODBYE
> /10.0.33.129:49809 ********
> 2014-03-19 17:28:22,327 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
> 2014-03-19 17:28:22,328 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading snapshot
> /home/zookeeper/data/version-2/snapshot.c200000001
> 2014-03-19 17:28:22,332 [myid:3] - INFO
> [Thread-140:Leader$LearnerCnxAcceptor@309] - exception while shutting down
> acceptor: java.net.SocketException: Socket closed
> 2014-03-19 17:28:24,004 [myid:3] - INFO
> [SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited loop!
> 2014-03-19 17:28:27,398 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41264
> 2014-03-19 17:28:27,399 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:27,399 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41264 (no session established for
> client)
> 2014-03-19 17:28:34,987 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41267
> 2014-03-19 17:28:34,988 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:34,988 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41267 (no session established for
> client)
> 2014-03-19 17:28:35,218 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@740] - New
> election. My id =  3, proposed zxid=0xc200000001
> 2014-03-19 17:28:35,219 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 3
> (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:35,420 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 3
> (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:35,420 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] -
> Notification time out: 400
> 2014-03-19 17:28:35,821 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 3
> (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:35,822 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] -
> Notification time out: 800
> 2014-03-19 17:28:36,623 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 3
> (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:36,623 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] -
> Notification time out: 1600
> 2014-03-19 17:28:36,800 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING (n.state), 1
> (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:37,096 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING (n.state), 2
> (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:37,097 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING (n.state), 2
> (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:38,698 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 3
> (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:38,698 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] -
> Notification time out: 3200
> 2014-03-19 17:28:38,700 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING (n.state), 1
> (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:38,705 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING (n.state), 2
> (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:39,408 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41270
> 2014-03-19 17:28:39,409 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:39,409 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41270 (no session established for
> client)
> 2014-03-19 17:28:41,906 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 3
> (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:41,906 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] -
> Notification time out: 6400
> 2014-03-19 17:28:42,390 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41273
> 2014-03-19 17:28:42,390 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:42,391 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41273 (no session established for
> client)
> 2014-03-19 17:28:44,729 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41276
> 2014-03-19 17:28:44,730 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:44,730 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41276 (no session established for
> client)
> 2014-03-19 17:28:48,307 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] -
> Notification time out: 12800
> 2014-03-19 17:28:48,308 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 3
> (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:49,840 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 1
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 1
> (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:49,841 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 1
> (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:28:50,042 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@750] - LEADING
> 2014-03-19 17:28:50,042 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162] - Created
> server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000
> datadir /home/zookeeper/data/version-2 snapdir
> /home/zookeeper/data/version-2
> 2014-03-19 17:28:50,042 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@345] - LEADING - LEADER
> ELECTION TOOK - 27714
> 2014-03-19 17:28:50,045 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading snapshot
> /home/zookeeper/data/version-2/snapshot.c200000001
> 2014-03-19 17:28:50,540 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41279
> 2014-03-19 17:28:50,541 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:50,541 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41279 (no session established for
> client)
> 2014-03-19 17:28:51,406 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 2
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 2
> (n.sid), 0xc5 (n.peerEPoch), LEADING (my state)
> 2014-03-19 17:28:51,406 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3
> (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING (n.state), 2
> (n.sid), 0xc5 (n.peerEPoch), LEADING (my state)
> 2014-03-19 17:28:53,526 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41282
> 2014-03-19 17:28:53,526 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:53,527 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41282 (no session established for
> client)
> 2014-03-19 17:28:59,322 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41285
> 2014-03-19 17:28:59,323 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:28:59,323 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41285 (no session established for
> client)
> 2014-03-19 17:29:00,253 [myid:3] - INFO
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@240] - Snapshotting:
> 0xc200000001 to /home/zookeeper/data/version-2/snapshot.c200000001
> 2014-03-19 17:29:04,860 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:29:04,860 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41288 (no session established for
> client)
> 2014-03-19 17:29:11,031 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41291
> 2014-03-19 17:29:11,032 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:29:11,032 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41291 (no session established for
> client)
> 2014-03-19 17:29:16,490 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41294
> 2014-03-19 17:29:16,491 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:29:16,491 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41294 (no session established for
> client)
> 2014-03-19 17:29:19,064 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
> Accepted socket connection from /10.0.160.243:41297
> 2014-03-19 17:29:19,065 [myid:3] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not
> running
> 2014-03-19 17:29:19,065 [myid:3] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed
> socket connection for client /10.0.160.243:41297 (no session established for
> client)
> 2014-03-19 17:29:19,312 [myid:3] - INFO
> [LearnerHandler-/10.0.33.1:58547:LearnerHandler@263] - Follower sid: 1 :
> info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@3c966db5
> 2014-03-19 17:29:19,314 [myid:3] - INFO
> [LearnerHandler-/10.0.33.129:49810:LearnerHandler@263] - Follower sid: 2 :
> info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@466b56b
> 2014-03-19 17:29:19,475 [myid:3] - ERROR
> [LearnerHandler-/10.0.33.1:58547:LearnerHandler@562] - Unexpected exception
> causing shutdown while sock still open
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:290)
> 2014-03-19 17:29:19,476 [myid:3] - WARN
> [LearnerHandler-/10.0.33.1:58547:LearnerHandler@575] - ******* GOODBYE
> /10.0.33.1:58547 ********
> 2014-03-19 17:29:19,476 [myid:3] - ERROR
> [LearnerHandler-/10.0.33.129:49810:LearnerHandler@562] - Unexpected
> exception causing shutdown while sock still open
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:290)
> 2014-03-19 17:29:19,477 [myid:3] - WARN
> [LearnerHandler-/10.0.33.129:49810:LearnerHandler@575] - ******* GOODBYE
> /10.0.33.129:49810 ********
> 2014-03-19 17:29:21,757 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 1
> (n.leader), 0xc200000001 (n.zxid), 0x128 (n.round), LOOKING (n.state), 1
> (n.sid), 0xc5 (n.peerEPoch), LEADING (my state)
>
>
>
> 2014-03-23 12:05 GMT+09:00 Michi Mutsuzaki <michi@cs.stanford.edu>:
>
>> Hi Youngseok,
>>
>> Could you post the log file from 192.168.161.1? The log file you
>> posted indicates that 192.168.33.1 is not able to connect to
>> 192.168.161.1.
>>
>> Thanks!
>> --Michi
>>
>>
>> On Fri, Mar 21, 2014 at 12:14 AM, Jung Young Seok
>> <jung.youngseok@gmail.com> wrote:
>> > Dear Zookeeper usergroup members,
>> >
>> > I have some questions.
>> >
>> > We're currently running ZooKeeper 3.4.5 as a 3-node cluster.
>> > The ZooKeeper service stopped all of a sudden, so clients weren't able to
>> > connect to the ZooKeeper servers.
>> > In that state, the servers couldn't elect a leader among themselves.
>> >
>> > Then I restarted the ZooKeeper service on all of them, but they still
>> > couldn't elect a leader or become followers.
>> > So I rebooted Linux, but the same thing happened. (I lost the ZooKeeper
>> > log here t.t)
>> > When I removed the snapshot files from the data directory, ZooKeeper
>> > worked okay.
>> > I have uploaded my ZooKeeper snapshot here:
>> > https://s3-ap-northeast-1.amazonaws.com/zookeeper-logs/data_org_b1.tar
>> >
>> > If I put the snapshot back into the data directory, the cluster failure
>> > reappears.
>> >
>> > My questions are:
>> >  1. Why was the snapshot corrupted all of a sudden?
>> >  2. Is there any way I can avoid this snapshot corruption issue?
>> >
>> > I've attached zoo.cfg and part of the error log.
>> >
>> > I'd be happy to get any opinions.
>> > Thank you.
>> >
>> > Best Regards
>> > Youngseok Jung
>> >
>> >
>> > #zoo.cfg (pretty much default setting)
>> > tickTime=2000
>> > initLimit=10
>> > syncLimit=5
>> > dataDir=/home/zookeeper/data
>> > clientPort=2181
>> >
>> > server.1=192.168.33.1:2888:3888
>> > server.2=192.168.33.129:2888:3888
>> > server.3=192.168.161.1:2888:3888
>> > autopurge.snapRetainCount=3
>> > autopurge.purgeInterval=1
>> >
>> >
>> > #Part of the error log
>> > 2014-03-19 17:56:24,737 [myid:1] - INFO
>> >  [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 2
>> > (n.leader), 0xc600000001 (n.zxid), 0x144 (n.round), LEADING (n.state), 2
>> > (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state)
>> > 2014-03-19 17:56:24,737 [myid:1] - WARN
>> >  [WorkerSender[myid=1]:QuorumCnxManager@368] - Cannot open channel to 3
>> > at
>> > election address /10.0.161.1:3888
>> > java.net.ConnectException: Connection refused
>> >         at java.net.PlainSocketImpl.socketConnect(Native Method)
>> >         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>> >         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>> >         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>> >         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>> >         at java.net.Socket.connect(Socket.java:579)
>> >         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:354)
>> >         at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:327)
>> >         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:393)
>> >         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:365)
>> >         at java.lang.Thread.run(Thread.java:724)
>> > 2014-03-19 17:56:25,537 [myid:1] - INFO
>> >  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] -
>> > Notification time out: 1600
>> > 2014-03-19 17:56:25,538 [myid:1] - INFO
>> >  [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 1
>> > (n.leader), 0xc200000001 (n.zxid), 0x145 (n.round), LOOKING (n.state), 1
>> > (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state)
>> > 2014-03-19 17:56:25,540 [myid:1] - INFO
>> >  [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 2
>> > (n.leader), 0xc600000001 (n.zxid), 0x144 (n.round), LEADING (n.state), 2
>> > (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state)
>> > 2014-03-19 17:56:25,540 [myid:1] - WARN
>> >  [WorkerSender[myid=1]:QuorumCnxManager@368] - Cannot open channel to 3
>> > at
>> > election address /10.0.161.1:3888
>> > java.net.ConnectException: Connection refused
>> >         at java.net.PlainSocketImpl.socketConnect(Native Method)
>> >         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>> >         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>> >         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>> >         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>> >         at java.net.Socket.connect(Socket.java:579)
>> >         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:354)
>> >         at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:327)
>> >         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:393)
>> >         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:365)
>> >         at java.lang.Thread.run(Thread.java:724)
>
>
