zookeeper-user mailing list archives

From Benjamin Jaton <benjamin.ja...@gmail.com>
Subject Failover when one node fails to write on the disk?
Date Wed, 07 Jan 2015 22:34:16 GMT
Using ZooKeeper 3.4.5, I came across a situation where all 3 ZooKeeper nodes
suddenly stopped.

What I see is that NODE1 fails to write to the disk, so it makes sense to
me that NODE1 stops.

But it is unclear why NODE2 and NODE3 would stop running as well. I have a
hard time making sense of the log messages.

Any insight would be greatly appreciated!

See the log extracts below:

NODE1:

-- no log for several days before this --
2015-01-04 16:18:22,259 [myid:1] - WARN  [SyncThread:1:FileTxnLog@321] -
fsync-ing the write ahead log in SyncThread:1 took 11024ms which will
adversely effect operation latency. See the ZooKeeper troubleshooting guide
2015-01-04 16:18:22,380 [myid:1] - WARN
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
following the leader
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
        at
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
        at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
        at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
2015-01-04 16:18:23,384 [myid:1] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of
session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-01-04 16:18:23,492 [myid:1] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of
session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-01-04 16:18:24,060 [myid:1] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of
session 0x0 due to java.io.IOException: ZooKeeperServer not running


NODE2:

-- no log for several days before this --
2015-01-04 16:18:21,899 [myid:3] - WARN
[QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
following the leader
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
        at
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
        at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
        at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
2015-01-04 16:18:22,760 [myid:3] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of
session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-01-04 16:18:22,801 [myid:3] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of
session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-01-04 16:18:22,886 [myid:3] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of
session 0x0 due to java.io.IOException: ZooKeeperServer not running


NODE3 (leader):

-- no log for several days before this --
2015-01-04 16:18:21,897 [myid:2] - WARN
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:LearnerHandler@687] - Closing
connection to peer due to transaction timeout.
2015-01-04 16:18:21,898 [myid:2] - WARN
[LearnerHandler-/204.53.107.249:43402:LearnerHandler@646] - ******* GOODBYE
/204.53.107.249:43402 ********
2015-01-04 16:18:21,905 [myid:2] - WARN
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:LearnerHandler@687] - Closing
connection to peer due to transaction timeout.
2015-01-04 16:18:21,907 [myid:2] - WARN
[LearnerHandler-/204.53.107.247:45953:LearnerHandler@646] - ******* GOODBYE
/204.53.107.247:45953 ********
2015-01-04 16:18:21,918 [myid:2] - WARN
[LearnerHandler-/204.53.107.247:45953:LearnerHandler@658] - Ignoring
unexpected exception
java.lang.InterruptedException
        at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
        at
java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
        at
java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)
        at
org.apache.zookeeper.server.quorum.LearnerHandler.shutdown(LearnerHandler.java:656)
        at
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:649)
2015-01-04 16:18:23,003 [myid:2] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of
session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-01-04 16:18:23,007 [myid:2] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of
session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-01-04 16:18:23,115 [myid:2] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of
session 0x0 due to java.io.IOException: ZooKeeperServer not running
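
One way to line up the extracts above across the three nodes is to merge them by timestamp. The following is a minimal Python sketch under stated assumptions, not a definitive tool: the names `parse_events`, `merge_timeline`, and `SAMPLE` are hypothetical, `SAMPLE` holds only the first physical line of a few entries copied from the extracts, and the ordering is meaningful only if the clocks on the three hosts are closely synchronized.

```python
import re
from datetime import datetime

# Matches the leading timestamp and myid of a ZooKeeper log line,
# e.g. "2015-01-04 16:18:22,259 [myid:1] - WARN ..."
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[myid:(\d+)\]"
)

def parse_events(node, text):
    """Return (timestamp, node, myid, rest) for each line that starts an entry."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # continuation lines and stack frames are skipped
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, node, int(m.group(2)), line[m.end():].strip()))
    return events

def merge_timeline(logs):
    """Merge per-node log text into one list sorted by timestamp."""
    events = []
    for node, text in logs.items():
        events.extend(parse_events(node, text))
    return sorted(events, key=lambda e: e[0])

# Hypothetical sample: first lines of a few entries from the extracts above.
SAMPLE = {
    "NODE1": (
        "2015-01-04 16:18:22,259 [myid:1] - WARN  [SyncThread:1:FileTxnLog@321] -\n"
        "2015-01-04 16:18:22,380 [myid:1] - WARN\n"
    ),
    "NODE2": "2015-01-04 16:18:21,899 [myid:3] - WARN\n",
    "NODE3": "2015-01-04 16:18:21,897 [myid:2] - WARN\n",
}

timeline = merge_timeline(SAMPLE)
for ts, node, myid, rest in timeline:
    print(ts.isoformat(timespec="milliseconds"), node, f"myid={myid}", rest)
```

With the sample lines, the leader's first "Closing connection" entry (NODE3) sorts before the follower exceptions on NODE2 and NODE1.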


Thanks!
Benjamin
