zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Flavio Junqueira <...@apache.org>
Subject Re: Zookeeper is unable to start
Date Fri, 22 Apr 2016 05:01:27 GMT
The ensemble is electing a leader, but from this:

2016-04-21 11:06:44,705 [myid:3] - INFO  [LearnerHandler-/192.168.1.3:54214:LearnerHandler@518]
- Received NEWLEADER-ACK message from 2
2016-04-21 11:07:45,832 [myid:3] - ERROR [LearnerHandler-/192.168.1.2:35258:LearnerHandler@631]
- Unexpected exception causing shutdown while sock still open
java.io.EOFException

It looks like the connection to the follower is timing out and the leader is losing quorum,
that's the reason you're seeing these messages that the service isn't available. You need
to determine why the connection is dropping. Does it happen every time you start it?

-Flavio


> On 21 Apr 2016, at 19:39, Eric Gao <gaoqiangdba@163.com> wrote:
> 
> Sorry.
> 
> 
> Here are the logs:
> Master's zookeeper.out:
> 2016-04-21 11:30:06,645 [myid:] - INFO  [main:QuorumPeerConfig@103] - Reading configuration
from: /opt/zookeeper/bin/../conf/zoo.cfg
> 2016-04-21 11:30:06,724 [myid:] - INFO  [main:QuorumPeer$QuorumServer@149] - Resolved
hostname: 192.168.1.4 to address: /192.168.1.4
> 2016-04-21 11:30:06,725 [myid:] - INFO  [main:QuorumPeer$QuorumServer@149] - Resolved
hostname: 192.168.1.3 to address: /192.168.1.3
> 2016-04-21 11:30:06,726 [myid:] - INFO  [main:QuorumPeer$QuorumServer@149] - Resolved
hostname: 192.168.1.2 to address: /192.168.1.2
> 2016-04-21 11:30:06,727 [myid:] - INFO  [main:QuorumPeerConfig@331] - Defaulting to majority
quorums
> 2016-04-21 11:30:06,737 [myid:1] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount
set to 3
> 2016-04-21 11:30:06,738 [myid:1] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval
set to 0
> 2016-04-21 11:30:06,738 [myid:1] - INFO  [main:DatadirCleanupManager@101] - Purge task
is not scheduled.
> 2016-04-21 11:30:06,765 [myid:1] - INFO  [main:QuorumPeerMain@127] - Starting quorum
peer
> 2016-04-21 11:30:06,794 [myid:1] - INFO  [main:NIOServerCnxnFactory@89] - binding to
port 0.0.0.0/0.0.0.0:2181
> 2016-04-21 11:30:06,815 [myid:1] - INFO  [main:QuorumPeer@1019] - tickTime set to 2000
> 2016-04-21 11:30:06,815 [myid:1] - INFO  [main:QuorumPeer@1039] - minSessionTimeout set
to -1
> 2016-04-21 11:30:06,816 [myid:1] - INFO  [main:QuorumPeer@1050] - maxSessionTimeout set
to -1
> 2016-04-21 11:30:06,816 [myid:1] - INFO  [main:QuorumPeer@1065] - initLimit set to 10
> 2016-04-21 11:30:06,851 [myid:1] - INFO  [main:FileSnap@83] - Reading snapshot /opt/zookeeper/data/version-2/snapshot.200000000
> 2016-04-21 11:30:06,882 [myid:1] - INFO  [ListenerThread:QuorumCnxManager$Listener@534]
- My election bind port: /192.168.1.2:3887
> 2016-04-21 11:30:06,989 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@774]
- LOOKING
> 00
> 0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x2 (n.peerEpoch) LOOKING (my
state)
> 2016-04-21 11:30:07,020 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@199]
- Have smaller server identifier, so dropping the connection: (2, 1)
> 2016-04-21 11:30:07,024 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumCnxManager@199]
- Have smaller server identifier, so dropping the connection: (3, 1)
> 2016-04-21 11:30:07,035 [myid:1] - INFO  [/192.168.1.2:3887:QuorumCnxManager$Listener@541]
- Received connection request /192.168.1.3:56474
> d), 0x1 (n.round), FOLLOWING (n.state), 2 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
> d), 0x1 (n.round), FOLLOWING (n.state), 2 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
> 2016-04-21 11:30:07,043 [myid:1] - INFO  [/192.168.1.2:3887:QuorumCnxManager$Listener@541]
- Received connection request /192.168.1.4:46486
> d), 0x1 (n.round), LEADING (n.state), 3 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
> 2016-04-21 11:30:07,050 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@844]
- FOLLOWING
> 2016-04-21 11:30:07,058 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@86]
- TCP NoDelay set to: true
> 02/06/2016 03:18 GMT
> 2016-04-21 11:30:07,071 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:host.name=master
> 2016-04-21 11:30:07,071 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.version=1.7.0_75
> 2016-04-21 11:30:07,072 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.vendor=Oracle Corporation
> njdk-1.7.0.75-2.5.4.2.el7_0.x86_64/jre
> lib
> lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2016-04-21 11:30:07,072 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.io.tmpdir=/tmp
> 2016-04-21 11:30:07,073 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.compiler=<NA>
> 2016-04-21 11:30:07,073 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:os.name=Linux
> 2016-04-21 11:30:07,073 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:os.arch=amd64
> 2016-04-21 11:30:07,073 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:os.version=3.10.0-229.el7.x86_64
> 2016-04-21 11:30:07,073 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:user.name=hadoop
> 2016-04-21 11:30:07,073 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:user.home=/home/hadoop
> 2016-04-21 11:30:07,074 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:user.dir=/home/hadoop
> 4000 maxSessionTimeout 40000 datadir /opt/zookeeper/log/version-2 snapdir /opt/zookeeper/data/version-2
> 2016-04-21 11:30:07,081 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@63]
- FOLLOWING - LEADER ELECTION TOOK - 90
> 192.168.1.4
> d), 0x1 (n.round), LEADING (n.state), 3 (n.sid), 0x2 (n.peerEpoch) FOLLOWING (my state)
> 2016-04-21 11:30:07,178 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@326]
- Getting a diff from the leader 0x200000000
> ion-2/snapshot.200000000
> 
> 
> slave1's zookeeper.out:
> 2016-04-21 11:30:30,626 [myid:] - INFO  [main:QuorumPeerConfig@103] - Reading configuration
from: /opt/zookeeper/bin/../conf/zoo.cfg
> 2016-04-21 11:30:30,658 [myid:] - INFO  [main:QuorumPeer$QuorumServer@149] - Resolved
hostname: 192.168.1.4 to address: /192.168.1.4
> 2016-04-21 11:30:30,659 [myid:] - INFO  [main:QuorumPeer$QuorumServer@149] - Resolved
hostname: 192.168.1.3 to address: /192.168.1.3
> 2016-04-21 11:30:30,662 [myid:] - INFO  [main:QuorumPeer$QuorumServer@149] - Resolved
hostname: 192.168.1.2 to address: /192.168.1.2
> 2016-04-21 11:30:30,662 [myid:] - INFO  [main:QuorumPeerConfig@331] - Defaulting to majority
quorums
> 2016-04-21 11:30:30,668 [myid:2] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount
set to 3
> 2016-04-21 11:30:30,668 [myid:2] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval
set to 0
> 2016-04-21 11:30:30,668 [myid:2] - INFO  [main:DatadirCleanupManager@101] - Purge task
is not scheduled.
> 2016-04-21 11:30:30,686 [myid:2] - INFO  [main:QuorumPeerMain@127] - Starting quorum
peer
> 2016-04-21 11:30:30,697 [myid:2] - INFO  [main:NIOServerCnxnFactory@89] - binding to
port 0.0.0.0/0.0.0.0:2181
> 2016-04-21 11:30:30,706 [myid:2] - INFO  [main:QuorumPeer@1019] - tickTime set to 2000
> 2016-04-21 11:30:30,706 [myid:2] - INFO  [main:QuorumPeer@1039] - minSessionTimeout set
to -1
> 2016-04-21 11:30:30,706 [myid:2] - INFO  [main:QuorumPeer@1050] - maxSessionTimeout set
to -1
> 2016-04-21 11:30:30,707 [myid:2] - INFO  [main:QuorumPeer@1065] - initLimit set to 10
> 2016-04-21 11:30:30,731 [myid:2] - INFO  [main:FileSnap@83] - Reading snapshot /opt/zookeeper/data/version-2/snapshot.0
> 2016-04-21 11:30:30,742 [myid:2] - INFO  [ListenerThread:QuorumCnxManager$Listener@534]
- My election bind port: /192.168.1.3:3888
> 2016-04-21 11:30:30,756 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@774]
- LOOKING
> 2016-04-21 11:30:30,758 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@818]
- New election. My id =  2, proposed zxid=0x0
> d), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
> 0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x2 (n.peerEpoch) LOOKING (my
state)
> 2016-04-21 11:30:30,768 [myid:2] - INFO  [WorkerSender[myid=2]:QuorumCnxManager@199]
- Have smaller server identifier, so dropping the connection: (3, 2)
> 0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x2 (n.peerEpoch) LOOKING (my
state)
> 2016-04-21 11:30:30,771 [myid:2] - INFO  [/192.168.1.3:3888:QuorumCnxManager$Listener@541]
- Received connection request /192.168.1.4:35199
> 2016-04-21 11:30:30,772 [myid:2] - INFO  [WorkerSender[myid=2]:QuorumCnxManager@199]
- Have smaller server identifier, so dropping the connection: (3, 2)
> d), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
> 2016-04-21 11:30:30,777 [myid:2] - INFO  [/192.168.1.3:3888:QuorumCnxManager$Listener@541]
- Received connection request /192.168.1.4:35200
> 2016-04-21 11:30:30,779 [myid:2] - WARN  [RecvWorker:3:QuorumCnxManager$RecvWorker@810]
- Connection broken for id 3, my id = 2, error = 
> java.net.SocketException: Socket closed
> at java.net.SocketInputStream.read(SocketInputStream.java:190)
> at java.net.SocketInputStream.read(SocketInputStream.java:122)
> at java.net.SocketInputStream.read(SocketInputStream.java:210)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:795)
> d), 0x1 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
> 2016-04-21 11:30:30,783 [myid:2] - WARN  [SendWorker:3:QuorumCnxManager$SendWorker@727]
- Interrupted while waiting for message on queue
> java.lang.InterruptedException
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)
> at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:879)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:65)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:715)
> 2016-04-21 11:30:30,784 [myid:2] - WARN  [SendWorker:3:QuorumCnxManager$SendWorker@736]
- Send worker leaving thread
> 2016-04-21 11:30:30,783 [myid:2] - WARN  [RecvWorker:3:QuorumCnxManager$RecvWorker@813]
- Interrupting SendWorker
> d), 0x1 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
> d), 0x1 (n.round), LEADING (n.state), 3 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
> d), 0x1 (n.round), LEADING (n.state), 3 (n.sid), 0x2 (n.peerEpoch) LOOKING (my state)
> 2016-04-21 11:30:30,993 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@844]
- FOLLOWING
> 2016-04-21 11:30:31,000 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Learner@86]
- TCP NoDelay set to: true
> 02/06/2016 03:18 GMT
> 2016-04-21 11:30:31,012 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:host.name=slave1
> 2016-04-21 11:30:31,013 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.version=1.7.0_75
> 2016-04-21 11:30:31,013 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.vendor=Oracle Corporation
> njdk-1.7.0.75-2.5.4.2.el7_0.x86_64/jre
> lib
> lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2016-04-21 11:30:31,014 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.io.tmpdir=/tmp
> 2016-04-21 11:30:31,014 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.compiler=<NA>
> 2016-04-21 11:30:31,014 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:os.name=Linux
> 2016-04-21 11:30:31,014 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:os.arch=amd64
> 2016-04-21 11:30:31,014 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:os.version=3.10.0-229.el7.x86_64
> 2016-04-21 11:30:31,014 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:user.name=hadoop
> 2016-04-21 11:30:31,015 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:user.home=/home/hadoop
> 2016-04-21 11:30:31,015 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:user.dir=/home/hadoop
> 4000 maxSessionTimeout 40000 datadir /opt/zookeeper/log/version-2 snapdir /opt/zookeeper/data/version-2
> 2016-04-21 11:30:31,019 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@63]
- FOLLOWING - LEADER ELECTION TOOK - 261
> 192.168.1.2
> 87
> java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:228)
> at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:846)
> 87
> java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:228)
> at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:846)\
> 
> 
> 
> 
> slave2's zookeeper.out:
> 2016-04-21 11:06:42,294 [myid:] - INFO  [main:QuorumPeerConfig@103] - Reading configuration
from: /opt/zookeeper/bin/../conf/zoo.cfg
> 2016-04-21 11:06:42,752 [myid:] - INFO  [main:QuorumPeer$QuorumServer@149] - Resolved
hostname: 192.168.1.4 to address: /192.168.1.4
> 2016-04-21 11:06:42,753 [myid:] - INFO  [main:QuorumPeer$QuorumServer@149] - Resolved
hostname: 192.168.1.3 to address: /192.168.1.3
> 2016-04-21 11:06:42,754 [myid:] - INFO  [main:QuorumPeer$QuorumServer@149] - Resolved
hostname: 192.168.1.2 to address: /192.168.1.2
> 2016-04-21 11:06:42,755 [myid:] - INFO  [main:QuorumPeerConfig@331] - Defaulting to majority
quorums
> 2016-04-21 11:06:42,768 [myid:3] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount
set to 3
> 2016-04-21 11:06:42,768 [myid:3] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval
set to 0
> 2016-04-21 11:06:42,768 [myid:3] - INFO  [main:DatadirCleanupManager@101] - Purge task
is not scheduled.
> 2016-04-21 11:06:42,808 [myid:3] - INFO  [main:QuorumPeerMain@127] - Starting quorum
peer
> 2016-04-21 11:06:42,832 [myid:3] - INFO  [main:NIOServerCnxnFactory@89] - binding to
port 0.0.0.0/0.0.0.0:2181
> 2016-04-21 11:06:42,847 [myid:3] - INFO  [main:QuorumPeer@1019] - tickTime set to 2000
> 2016-04-21 11:06:42,847 [myid:3] - INFO  [main:QuorumPeer@1039] - minSessionTimeout set
to -1
> 2016-04-21 11:06:42,847 [myid:3] - INFO  [main:QuorumPeer@1050] - maxSessionTimeout set
to -1
> 2016-04-21 11:06:42,847 [myid:3] - INFO  [main:QuorumPeer@1065] - initLimit set to 10
> 2016-04-21 11:06:42,971 [myid:3] - INFO  [ListenerThread:QuorumCnxManager$Listener@534]
- My election bind port: /192.168.1.4:3889
> 2016-04-21 11:06:43,001 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@774]
- LOOKING
> 2016-04-21 11:06:43,011 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@818]
- New election. My id =  3, proposed zxid=0x0
> 2016-04-21 11:06:43,161 [myid:3] - WARN  [WorkerSender[myid=3]:QuorumCnxManager@400]
- Cannot open channel to 2 at election address /192.168.1.3:3888
> java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:354)
> at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:452)
> at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:433)
> at java.lang.Thread.run(Thread.java:745)
> 2016-04-21 11:06:43,165 [myid:3] - INFO  [WorkerSender[myid=3]:QuorumPeer$QuorumServer@149]
- Resolved hostname: 192.168.1.3 to address: /192.168.1.3
> 2016-04-21 11:06:43,167 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@600]
- Notification: 1 (message format version), 3 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING
(n.state), 3 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
> 2016-04-21 11:06:43,192 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@600]
- Notification: 1 (message format version), 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING
(n.state), 1 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
> 2016-04-21 11:06:43,197 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@600]
- Notification: 1 (message format version), 3 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING
(n.state), 1 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
> 2016-04-21 11:06:43,282 [myid:3] - INFO  [/192.168.1.4:3889:QuorumCnxManager$Listener@541]
- Received connection request /192.168.1.3:38903
> 2016-04-21 11:06:43,286 [myid:3] - INFO  [/192.168.1.4:3889:QuorumCnxManager$Listener@541]
- Received connection request /192.168.1.3:38904
> 2016-04-21 11:06:43,287 [myid:3] - WARN  [RecvWorker:2:QuorumCnxManager$RecvWorker@810]
- Connection broken for id 2, my id = 3, error = 
> java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:152)
> at java.net.SocketInputStream.read(SocketInputStream.java:122)
> at java.net.SocketInputStream.read(SocketInputStream.java:210)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:795)
> 2016-04-21 11:06:43,288 [myid:3] - WARN  [SendWorker:2:QuorumCnxManager$SendWorker@727]
- Interrupted while waiting for message on queue
> java.lang.InterruptedException
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)
> at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:879)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:65)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:715)
> 2016-04-21 11:06:43,288 [myid:3] - WARN  [RecvWorker:2:QuorumCnxManager$RecvWorker@813]
- Interrupting SendWorker
> 2016-04-21 11:06:43,289 [myid:3] - WARN  [SendWorker:2:QuorumCnxManager$SendWorker@736]
- Send worker leaving thread
> 2016-04-21 11:06:43,301 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@600]
- Notification: 1 (message format version), 3 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING
(n.state), 2 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
> 2016-04-21 11:06:43,505 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@856]
- LEADING
> 2016-04-21 11:06:43,510 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@59]
- TCP NoDelay set to: true
> 2016-04-21 11:06:43,543 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT
> 2016-04-21 11:06:43,543 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:host.name=slave2
> 2016-04-21 11:06:43,572 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.version=1.7.0_75
> 2016-04-21 11:06:43,573 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.vendor=Oracle Corporation
> 2016-04-21 11:06:43,573 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.home=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64/jre
> 2016-04-21 11:06:43,573 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.class.path=/opt/zookeeper/bin/../build/classes:/opt/zookeeper/bin/../build/lib/*.jar:/opt/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/opt/zookeeper/bin/../lib/log4j-1.2.16.jar:/opt/zookeeper/bin/../lib/jline-0.9.94.jar:/opt/zookeeper/bin/../zookeeper-3.4.8.jar:/opt/zookeeper/bin/../src/java/lib/*.jar:/opt/zookeeper/bin/../conf:/opt/zookeeper/bin/../build/classes:/opt/zookeeper/bin/../build/lib/*.jar:/opt/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/opt/zookeeper/bin/../lib/log4j-1.2.16.jar:/opt/zookeeper/bin/../lib/jline-0.9.94.jar:/opt/zookeeper/bin/../zookeeper-3.4.8.jar:/opt/zookeeper/bin/../src/java/lib/*.jar:/opt/zookeeper/bin/../conf:.::/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64/lib:/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64/jre/lib
> 2016-04-21 11:06:43,573 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2016-04-21 11:06:43,574 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.io.tmpdir=/tmp
> 2016-04-21 11:06:43,574 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:java.compiler=<NA>
> 2016-04-21 11:06:43,574 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:os.name=Linux
> 2016-04-21 11:06:43,574 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:os.arch=amd64
> 2016-04-21 11:06:43,574 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:os.version=3.10.0-229.el7.x86_64
> 2016-04-21 11:06:43,575 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:user.name=hadoop
> 2016-04-21 11:06:43,575 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:user.home=/home/hadoop
> 2016-04-21 11:06:43,575 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Environment@100]
- Server environment:user.dir=/home/hadoop
> 2016-04-21 11:06:43,580 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@170]
- Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir
/opt/zookeeper/log/version-2 snapdir /opt/zookeeper/data/version-2
> 2016-04-21 11:06:43,582 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@361]
- LEADING - LEADER ELECTION TOOK - 571
> 2016-04-21 11:06:44,568 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35258:LearnerHandler@329]
- Follower sid: 1 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@4a7baf7d
> 2016-04-21 11:06:44,595 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35258:LearnerHandler@384]
- Synchronizing with Follower sid: 1 maxCommittedLog=0x0 minCommittedLog=0x0 peerLastZxid=0x0
> 2016-04-21 11:06:44,596 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35258:LearnerHandler@393]
- leader and follower are in sync, zxid=0x0
> 2016-04-21 11:06:44,596 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35258:LearnerHandler@458]
- Sending DIFF
> 2016-04-21 11:06:44,615 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35258:LearnerHandler@518]
- Received NEWLEADER-ACK message from 1
> 2016-04-21 11:06:44,619 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@946]
- Have quorum of supporters, sids: [ 1,3 ]; starting up and setting last processed zxid: 0x200000000
> 2016-04-21 11:06:44,622 [myid:3] - INFO  [LearnerHandler-/192.168.1.3:54214:LearnerHandler@329]
- Follower sid: 2 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@72e042cb
> 2016-04-21 11:06:44,633 [myid:3] - INFO  [LearnerHandler-/192.168.1.3:54214:LearnerHandler@384]
- Synchronizing with Follower sid: 2 maxCommittedLog=0x0 minCommittedLog=0x0 peerLastZxid=0x0
> 2016-04-21 11:06:44,633 [myid:3] - INFO  [LearnerHandler-/192.168.1.3:54214:LearnerHandler@393]
- leader and follower are in sync, zxid=0x0
> 2016-04-21 11:06:44,634 [myid:3] - INFO  [LearnerHandler-/192.168.1.3:54214:LearnerHandler@458]
- Sending DIFF
> 2016-04-21 11:06:44,705 [myid:3] - INFO  [LearnerHandler-/192.168.1.3:54214:LearnerHandler@518]
- Received NEWLEADER-ACK message from 2
> 2016-04-21 11:07:45,832 [myid:3] - ERROR [LearnerHandler-/192.168.1.2:35258:LearnerHandler@631]
- Unexpected exception causing shutdown while sock still open
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
> at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:542)
> 2016-04-21 11:07:45,834 [myid:3] - WARN  [LearnerHandler-/192.168.1.2:35258:LearnerHandler@644]
- ******* GOODBYE /192.168.1.2:35258 ********
> 2016-04-21 11:07:45,874 [myid:3] - WARN  [RecvWorker:1:QuorumCnxManager$RecvWorker@810]
- Connection broken for id 1, my id = 3, error = 
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:795)
> 2016-04-21 11:07:45,876 [myid:3] - WARN  [RecvWorker:1:QuorumCnxManager$RecvWorker@813]
- Interrupting SendWorker
> 2016-04-21 11:07:45,888 [myid:3] - WARN  [SendWorker:1:QuorumCnxManager$SendWorker@727]
- Interrupted while waiting for message on queue
> java.lang.InterruptedException
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)
> at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:879)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:65)
> at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:715)
> 2016-04-21 11:07:45,889 [myid:3] - WARN  [SendWorker:1:QuorumCnxManager$SendWorker@736]
- Send worker leaving thread
> 2016-04-21 11:07:58,881 [myid:3] - INFO  [/192.168.1.4:3889:QuorumCnxManager$Listener@541]
- Received connection request /192.168.1.2:51099
> 2016-04-21 11:07:58,915 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@600]
- Notification: 1 (message format version), 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING
(n.state), 1 (n.sid), 0x2 (n.peerEpoch) LEADING (my state)
> 2016-04-21 11:07:59,150 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35263:LearnerHandler@329]
- Follower sid: 1 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@4a7baf7d
> 2016-04-21 11:07:59,155 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35263:LearnerHandler@384]
- Synchronizing with Follower sid: 1 maxCommittedLog=0x0 minCommittedLog=0x0 peerLastZxid=0x0
> 2016-04-21 11:07:59,155 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35263:LearnerHandler@458]
- Sending SNAP
> 2016-04-21 11:07:59,156 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35263:LearnerHandler@482]
- Sending snapshot last zxid of peer is 0x0  zxid of leader is 0x200000000sent zxid of db
as 0x200000000
> 2016-04-21 11:07:59,212 [myid:3] - INFO  [LearnerHandler-/192.168.1.2:35263:LearnerHandler@518]
- Received NEWLEADER-ACK message from 1
> 
> 
> Eric Gao
> Keep on going never give up.
> Blog:
> http://gaoqiang.blog.chinaunix.net/
> http://gaoqiangdba.blog.163.com/
> 
> 
> 
> From: Camille Fournier
> Date: 2016-04-22 10:25
> To: user@zookeeper.apache.org
> Subject: Re: Zookeeper is unable to start
> Your attachments won't come through. Can you look at the log files and copy
> some of the last log lines into email?
> 
> 
> On Thu, Apr 21, 2016 at 10:23 PM, Eric Gao <gaoqiangdba@163.com> wrote:
> 
>> Dear exports,
>> I have encountered a problem when I have started the zookeeper:
>> 
>> [root@master data]# /opt/zookeeper/bin/zkServer.sh start
>> ZooKeeper JMX enabled by default
>> Using config: /opt/zookeeper/bin/../conf/zoo.cfg
>> Starting zookeeper ... date
>> STARTED
>> 
>> [root@master ~]# /opt/zookeeper/bin/zkServer.sh status
>> ZooKeeper JMX enabled by default
>> Using config: /opt/zookeeper/bin/../conf/zoo.cfg
>> Error contacting service. It is probably not running.
>> 
>> All the 3 servers have the same status.
>> 
>> 
>> Here is the configuration information:
>> 
>> zoo.cfg:
>> tickTime=2000
>> initLimit=10
>> syncLimit=5
>> clientPort=2181
>> server.1=192.168.1.2:2887:3887
>> server.2=192.168.1.3:2888:3888
>> server.3=192.168.1.4:2889:3889
>> dataDir=/opt/zookeeper/data
>> dataLogDir=/opt/zookeeper/log
>> 
>> 
>> And the 3 servers' zookeeper.out are put in attachment files.
>> 
>> Thanks for your attention.
>> 
>> 
>> ------------------------------
>> *Eric Gao*
>> Keep on going never give up.
>> *Blog:*
>> http://gaoqiang.blog.chinaunix.net/
>> http://gaoqiangdba.blog.163.com/
>> 
>> 
>> 


Mime
View raw message