zookeeper-user mailing list archives

From German Blanco <german.blanco.bla...@gmail.com>
Subject Re: New server cannot join quorum
Date Thu, 07 Nov 2013 04:33:26 GMT
Hello again,

I don't think it is a good idea to start a new thread for the same issue.

Could this be a DNS resolution caching problem?
See https://issues.apache.org/jira/browse/ZOOKEEPER-1506
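
One way to test the DNS hypothesis from each existing peer is to resolve the hostnames listed in zoo.cfg and compare the result with the address the new server actually has. Below is a minimal sketch, assuming the usual server.N=host:quorumPort:electionPort lines; the helper names are made up for illustration. It only shows what the OS resolver answers right now, which may differ from an address that a long-running ZooKeeper process is still holding.

```python
import re
import socket

# Matches lines such as "server.1=zk1.example.com:2888:3888".
SERVER_LINE = re.compile(r"^\s*server\.(\d+)\s*=\s*([^:\s]+):(\d+):(\d+)")


def parse_servers(cfg_text):
    """Return {sid: (host, quorum_port, election_port)} from zoo.cfg text."""
    servers = {}
    for line in cfg_text.splitlines():
        m = SERVER_LINE.match(line)
        if m:
            sid, host, qport, eport = m.groups()
            servers[int(sid)] = (host, int(qport), int(eport))
    return servers


def resolve(host):
    """Return the sorted IPv4 addresses the OS resolver gives for host now."""
    try:
        infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # the name does not resolve at all
    return sorted({info[4][0] for info in infos})


def report(cfg_path):
    """Print what every configured peer hostname currently resolves to."""
    with open(cfg_path) as f:
        servers = parse_servers(f.read())
    for sid, (host, _qport, eport) in sorted(servers.items()):
        print(f"server.{sid}: {host} -> {resolve(host)} (election port {eport})")
```

Running report() with the path to your zoo.cfg on each of the old servers and checking the line for the new server's sid should show whether they still see an outdated address for it.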

The new server has the lowest sid. It is able to connect to all other
servers, but the rest of the servers don't seem able to connect to it.
Connections from this server to the rest are useless, since they are
dropped because of the sid comparison that you see in the log.
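
The effect of the sid comparison can be sketched with a small model. This is a simplified illustration based on the log message, not the actual QuorumCnxManager code; surviving_links and can_connect are hypothetical names. The assumption is that a connection opened by the server with the smaller sid is dropped by that server, which then relies on the server with the larger sid to connect back.

```python
def surviving_links(sids, can_connect):
    """Return the set of (initiator, target) election links that stay open.

    can_connect(a, b) says whether a's connection attempt to b succeeds.
    A link opened by the smaller sid is dropped by its initiator, which
    then waits for the larger sid to connect back instead.
    """
    links = set()
    for a in sids:
        for b in sids:
            if a == b or not can_connect(a, b):
                continue
            if a > b:
                links.add((a, b))  # the larger sid keeps the link it opened
            # a < b: "Have smaller server identifier, so dropping the connection"
    return links


# New server has the lowest sid (1) and the others cannot reach it,
# for example because they hold an outdated address for it.
stale_low = surviving_links([1, 2, 3, 4, 5], lambda a, b: b != 1)
print(sorted(link for link in stale_low if 1 in link))

# Same reachability problem, but the new server has the highest sid (6).
stale_high = surviving_links([2, 3, 4, 5, 6], lambda a, b: b != 6)
print(sorted(link for link in stale_high if 6 in link))
```

In this model the server with sid 1 ends up with no links at all when the others cannot reach it, while a server with the highest sid keeps every link it opens itself.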

You could try changing the server addresses in the configuration to the AWS
public IP addresses of the peers, just to test whether that works. Or try
replacing the server with the highest sid instead; that should also work.
Otherwise (assuming the problem is DNS resolution), the only current
workaround that I can think of is the rolling restart, as you have noticed.



On Wed, Nov 6, 2013 at 9:51 AM, Bae, Jae Hyeon <metacret@gmail.com> wrote:

> Hi Zookeeper users
>
> With the same zoo.cfg, a new server with an empty zk data directory cannot
> join the quorum, using the same IP, the same version of zk and the same
> port. I didn't see any significant error messages, but the following lines
> repeated:
>
> 2013-11-05 17:42:08,287 - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumPeer@670] - LOOKING
> 2013-11-05 17:42:08,290 - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:FastLeaderElection@740] - New election. My id =  1, proposed zxid=0x0
> 2013-11-05 17:42:08,293 - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
> 2013-11-05 17:42:08,301 - INFO  [WorkerSender[myid=1]:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (2, 1)
> 2013-11-05 17:42:08,304 - INFO  [WorkerSender[myid=1]:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (3, 1)
> 2013-11-05 17:42:08,308 - INFO  [WorkerSender[myid=1]:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (4, 1)
> 2013-11-05 17:42:08,311 - INFO  [WorkerSender[myid=1]:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (5, 1)
> 2013-11-05 17:42:08,511 - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (5, 1)
> 2013-11-05 17:42:08,515 - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (2, 1)
> 2013-11-05 17:42:08,518 - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (3, 1)
> 2013-11-05 17:42:08,522 - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (4, 1)
> 2013-11-05 17:42:08,523 - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:FastLeaderElection@774] - Notification time out: 400
>
> Do you have any idea what I am doing wrong here? I asked the same question
> yesterday and was told that the new server should start normally, sync and
> join the quorum successfully.
>
> Thank you
> Best, Jae
>
