zookeeper-user mailing list archives

From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: zookeeper cluster
Date Fri, 13 Jun 2014 21:44:04 GMT
It's possible that two of the nodes can talk to each other but the third
can't reach either of them. When all three are running you still get a
quorum, because two of them can connect to each other. Once one of those two
is shut down, the remaining nodes cannot re-form a quorum. I would check with
something simple like telnet: verify that you can telnet from each host to
each of the other hosts on the ports you have configured.
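
If telnet isn't installed, the same check can be scripted. The following is only a rough sketch of that idea, not anything ZooKeeper ships: the host names are hypothetical placeholders, ports 2181 and 2888 are taken from the log quoted below, and 3888 is assumed as the conventional leader-election port, so substitute whatever your own zoo.cfg lists.

```python
import socket

# Hypothetical host names; replace with the servers listed in your zoo.cfg.
HOSTS = ["zk1.example.org", "zk2.example.org", "zk3.example.org"]
# 2181 and 2888 appear in the log quoted in this thread; 3888 is assumed
# here as the conventional leader-election port.
PORTS = [2181, 2888, 3888]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


def check_peers(hosts, ports, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {(h, p): can_connect(h, p, timeout) for h in hosts for p in ports}
```

Run `check_peers(HOSTS, PORTS)` on each of the three hosts in turn and compare the results; a pair that is reachable from one host but not from another points at a firewall or routing problem between those two machines. As far as I know only the current leader listens on the 2888 port, so a refusal there from a follower is not by itself a sign of trouble.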


On Fri, Jun 13, 2014 at 11:39 PM, Lahiru Gunathilake <glahiru@gmail.com>
wrote:

> Thanks all for the responses, but I still can't figure out why it's not
> working. If I had configured the cluster incorrectly, it should have given
> an error in the first place. When I kill the leader it fails, and when I
> kill a follower and try to start it again it doesn't work either, although
> the other nodes in the cluster keep working fine.
>
> When I kill the leader I see the following error in one of the followers:
>
> 2014-06-13 09:35:37,215 [myid:1] - WARN  [QuorumPeer[myid=1]/0.0.0.0:2181:Learner@233] - Unexpected exception, tries=1, connecting to /129.79.247.5:2888
> java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
> at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
> at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:432)
> at java.net.Socket.connect(Socket.java:529)
> at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:225)
> at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
>
>
> I can see 129.79.247.5 is the other follower, so something is wrong. But
> what I do not understand is why this does not show up when I start the
> cluster in the first place: when I start the cluster initially it finishes
> the voting process successfully, then one node becomes the leader and the
> rest become followers.
>
> Regards
> Lahiru
>
>
>
> On Thu, Jun 12, 2014 at 9:56 PM, James A. Robinson <jimr@highwire.org>
> wrote:
>
> > On Thu, Jun 12, 2014 at 4:47 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
> > wrote:
> >
> > > This is not correct; 3 is the minimum for redundancy. If 1 goes down, the
> > > other 2 can still form a quorum (as more than half of them
> > > remain).
> > >
> >
> > Thank you, it's good to know this -- I must have gotten confused
> > about the way the quorum logic worked at some point.
> >
> > Jim
> >
>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>
