zookeeper-user mailing list archives

From Olivier Mallassi <olivier.malla...@gmail.com>
Subject Re: zookeeper cluster
Date Sat, 14 Jun 2014 06:58:45 GMT
To be sure, and to help diagnose, I would also replace gw110.iu.xsede.org with
the IP address of the machine (to rule out an extra layer, DNS caching, or...).
At the beginning, when you start the cluster, you can check the ensemble
with the commands described at
http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#sc_zkCommands
You will be able to see how many followers you have at init.
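
One way to run that check from a script is to send the `mntr` four-letter command to each server's client port and read the reply. The sketch below is only an illustration: the host names in the usage comment are hypothetical placeholders, port 2181 is taken from the log quoted further down, and it assumes the servers accept four-letter commands (newer ZooKeeper releases may require them to be whitelisted). On the leader, the `mntr` reply normally includes a `zk_followers` line.

```python
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(reply):
    """Turn tab-separated `mntr` output into a dict of strings."""
    stats = {}
    for line in reply.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines that are not key<TAB>value pairs
            stats[key.strip()] = value.strip()
    return stats


# Usage (hypothetical host names, not taken from the thread):
# for host in ["zk1.example.org", "zk2.example.org", "zk3.example.org"]:
#     stats = parse_mntr(four_letter_word(host, 2181, "mntr"))
#     print(host, stats.get("zk_server_state"), stats.get("zk_followers"))
```

A server that is not part of a quorum typically answers with a plain message instead of key/value pairs, in which case `parse_mntr` returns an empty dict.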

On Friday, 13 June 2014, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:

> It's possible that two of the nodes can talk to each other, but the third
> can't. This means that when all three are running you will get a quorum
> because two can connect to each other. Once one of these two is shut down,
> you will not be able to re-form a quorum. I would check with something simple
> like telnet: confirm that you can telnet from each host to each of the other
> hosts on the ports you have configured.
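
If telnet is not installed, the same reachability test can be sketched in a few lines of Python and run on each host in turn. This is a rough equivalent, not a ZooKeeper tool: ports 2181 and 2888 appear in the log quoted below, 3888 is only the conventional election port and is an assumption here, and the host names in the usage comment are hypothetical.

```python
import socket

# 2181 (client) and 2888 (quorum) appear in the thread's log;
# 3888 (election) is an assumed conventional value. Adjust to your zoo.cfg.
PORTS = (2181, 2888, 3888)


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, DNS failure, ...
        return False


def check_ports(hosts, ports=PORTS, probe=can_connect):
    """Probe every host/port pair and return {(host, port): reachable}."""
    return {(h, p): probe(h, p) for h in hosts for p in ports}


# Usage, run once from every machine in the ensemble (hypothetical names):
# for (host, port), ok in check_ports(["zk1.example.org", "zk2.example.org"]).items():
#     print(host, port, "ok" if ok else "FAILED")
```

Only the current leader listens on the quorum port (2888 here), so a refusal on that port from a follower is not by itself a sign of a network problem. The election port is the more telling one.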
>
>
> On Fri, Jun 13, 2014 at 11:39 PM, Lahiru Gunathilake <glahiru@gmail.com>
> wrote:
>
> > Thanks, all, for the responses, but I still can't figure out why it's not
> > working. If I had misconfigured the cluster, it should have given an error
> > in the first place. When I kill the leader, it fails; likewise, when I kill
> > a follower and try to start it again, it doesn't work either, although the
> > other nodes in the cluster work fine.
> >
> > When I kill the leader, I see the following error in one of the followers:
> >
> > 2014-06-13 09:35:37,215 [myid:1] - WARN  [QuorumPeer[myid=1]/0.0.0.0:2181:Learner@233] - Unexpected exception, tries=1, connecting to /129.79.247.5:2888
> > java.net.ConnectException: Connection refused
> >     at java.net.PlainSocketImpl.socketConnect(Native Method)
> >     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
> >     at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
> >     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
> >     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:432)
> >     at java.net.Socket.connect(Socket.java:529)
> >     at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:225)
> >     at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
> >     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> >
> >
> > I can see that 129.79.247.5 is the other follower and that something is
> > wrong. What I do not understand is why this does not happen when I first
> > start the cluster: when I start the cluster initially, it finishes the
> > voting process successfully, then one node becomes the leader and the rest
> > become followers.
> >
> > Regards
> > Lahiru
> >
> >
> >
> > On Thu, Jun 12, 2014 at 9:56 PM, James A. Robinson <jimr@highwire.org>
> > wrote:
> >
> > > On Thu, Jun 12, 2014 at 4:47 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
> > > wrote:
> > >
> > > > This is not correct; 3 is the minimum for redundancy. If 1 goes down,
> > > > the other 2 can still form a quorum (as more than half of them are
> > > > remaining).
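
The majority rule behind this can be written down as a small calculation. A minimal sketch, with function names made up for illustration: an ensemble of n servers needs strictly more than half of them, that is n // 2 + 1, to form a quorum.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that is strictly more than half."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be down while a quorum is still possible."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 5):
    print(n, "servers: quorum of", quorum_size(n),
          "tolerates", tolerated_failures(n), "failure(s)")
```

With 3 servers the quorum is 2, so one failure is tolerated. With 2 servers the quorum is also 2, so no failure is tolerated, which is why 3 is the smallest ensemble that gives redundancy.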
> > > >
> > >
> > > Thank you, it's good to know this -- I must have gotten confused
> > > about the way the quorum logic worked at some point.
> > >
> > > Jim
> > >
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>
