zookeeper-user mailing list archives

From Flavio Junqueira <...@yahoo-inc.com>
Subject Re: Node not joining ensemble
Date Sun, 23 Oct 2011 10:36:34 GMT
Here is my interpretation after reading the logs:

1- Node 3 was restarted and initiated leader election for round 1;
2- Node 3 received a notification from node 1 saying that node 1 is the
leader, but node 3 didn't get a confirmation from a quorum. Since node 3
has a higher id and zxid, it does not change its mind about who should
be the leader: itself;
3- Node 3 didn't receive a notification from node 2 showing that a quorum
supports node 1, so node 3 sticks to its vote.
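
To make the reasoning in the steps above concrete, here is a minimal sketch in Python. It is not the FastLeaderElection code. It assumes votes are ordered by zxid first and server id second, and that a quorum is a strict majority of a 3-node ensemble. The names Vote, split_zxid, supersedes and has_quorum are made up for illustration. The zxid values are the ones from the quoted log.

```python
from typing import NamedTuple


class Vote(NamedTuple):
    """Hypothetical stand-in for a leader election vote."""
    leader: int  # id of the proposed leader
    zxid: int    # last zxid seen by the proposed leader


def split_zxid(zxid: int) -> tuple:
    """Split a 64-bit zxid into (epoch, counter): high and low 32 bits."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def supersedes(new: Vote, current: Vote) -> bool:
    """Assumed ordering: higher zxid wins, ties are broken by higher id."""
    return (new.zxid, new.leader) > (current.zxid, current.leader)


def has_quorum(supporters: set, ensemble_size: int) -> bool:
    """A quorum here is assumed to be a strict majority of the ensemble."""
    return len(supporters) > ensemble_size // 2


# Values taken from the quoted log.
own_vote = Vote(leader=3, zxid=12885265585)
from_node_1 = Vote(leader=1, zxid=8589935532)

print(split_zxid(own_vote.zxid))          # (3, 363697)
print(split_zxid(from_node_1.zxid))       # (2, 940)
print(supersedes(from_node_1, own_vote))  # False: node 3 keeps its own vote
print(has_quorum({1}, 3))                 # False: node 1 alone is not a quorum
print(has_quorum({1, 2}, 3))              # True: what node 2 would have added
```

Under these assumptions the vote from node 1 never beats node 3's own vote, and without hearing from node 2 there is no majority behind node 1, which matches the loop in the log.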

It sounds like a bug to me, so I suggest you report it in JIRA.

-Flavio

On Oct 22, 2011, at 3:13 AM, Jordan Zimmerman wrote:

> Interesting. I restarted Server 2 in the ensemble and the problem
> cleared itself.
>
> -JZ
>
> On 10/21/11 4:34 PM, "Jordan Zimmerman" <jzimmerman@netflix.com> wrote:
>
>> FYI - I turned on DEBUG and here's more log info:
>>
>> 2011-10-21 23:33:06,732 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@510] - id: 3, proposed  
>> id: 3,
>> zxid: 12885265585, proposed zxid: 12885265585
>> 2011-10-21 23:33:06,732 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@727] - Adding vote:  
>> From = 3,
>> Proposed leader = 3, Porposed zxid = 12885265585, Proposed epoch = 1
>> 2011-10-21 23:33:06,734 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:06,735 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 1 (n.leader),  
>> 8589935532
>> (n.zxid), 3 (n.round), LEADING (n.state), 1 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:08,336 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:QuorumCnxManager@414] - Queue size: 0
>> 2011-10-21 23:33:08,336 - INFO
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@697] - Notification  
>> time out:
>> 3200
>> 2011-10-21 23:33:08,336 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 1
>> 2011-10-21 23:33:08,337 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 2
>> 2011-10-21 23:33:08,337 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:08,337 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 3 (n.leader),  
>> 12885265585
>> (n.zxid), 1 (n.round), LOOKING (n.state), 3 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:08,337 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@510] - id: 3, proposed  
>> id: 3,
>> zxid: 12885265585, proposed zxid: 12885265585
>> 2011-10-21 23:33:08,337 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@727] - Adding vote:  
>> From = 3,
>> Proposed leader = 3, Porposed zxid = 12885265585, Proposed epoch = 1
>> 2011-10-21 23:33:08,339 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:08,339 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 1 (n.leader),  
>> 8589935532
>> (n.zxid), 3 (n.round), LEADING (n.state), 1 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:11,540 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:QuorumCnxManager@414] - Queue size: 0
>> 2011-10-21 23:33:11,540 - INFO
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@697] - Notification  
>> time out:
>> 6400
>> 2011-10-21 23:33:11,540 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 1
>> 2011-10-21 23:33:11,541 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 2
>> 2011-10-21 23:33:11,541 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:11,541 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 3 (n.leader),  
>> 12885265585
>> (n.zxid), 1 (n.round), LOOKING (n.state), 3 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:11,541 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@510] - id: 3, proposed  
>> id: 3,
>> zxid: 12885265585, proposed zxid: 12885265585
>> 2011-10-21 23:33:11,541 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@727] - Adding vote:  
>> From = 3,
>> Proposed leader = 3, Porposed zxid = 12885265585, Proposed epoch = 1
>> 2011-10-21 23:33:11,543 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:11,544 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 1 (n.leader),  
>> 8589935532
>> (n.zxid), 3 (n.round), LEADING (n.state), 1 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:17,945 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:QuorumCnxManager@414] - Queue size: 0
>> 2011-10-21 23:33:17,945 - INFO
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@697] - Notification  
>> time out:
>> 12800
>> 2011-10-21 23:33:17,945 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 1
>> 2011-10-21 23:33:17,946 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 2
>> 2011-10-21 23:33:17,946 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:17,946 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 3 (n.leader),  
>> 12885265585
>> (n.zxid), 1 (n.round), LOOKING (n.state), 3 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:17,946 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@510] - id: 3, proposed  
>> id: 3,
>> zxid: 12885265585, proposed zxid: 12885265585
>> 2011-10-21 23:33:17,946 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@727] - Adding vote:  
>> From = 3,
>> Proposed leader = 3, Porposed zxid = 12885265585, Proposed epoch = 1
>> 2011-10-21 23:33:17,948 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:17,948 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 1 (n.leader),  
>> 8589935532
>> (n.zxid), 3 (n.round), LEADING (n.state), 1 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:30,750 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:QuorumCnxManager@414] - Queue size: 0
>> 2011-10-21 23:33:30,750 - INFO
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@697] - Notification  
>> time out:
>> 25600
>> 2011-10-21 23:33:30,750 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 1
>> 2011-10-21 23:33:30,750 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 2
>> 2011-10-21 23:33:30,751 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:30,751 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 3 (n.leader),  
>> 12885265585
>> (n.zxid), 1 (n.round), LOOKING (n.state), 3 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:30,751 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@510] - id: 3, proposed  
>> id: 3,
>> zxid: 12885265585, proposed zxid: 12885265585
>> 2011-10-21 23:33:30,751 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@727] - Adding vote:  
>> From = 3,
>> Proposed leader = 3, Porposed zxid = 12885265585, Proposed epoch = 1
>> 2011-10-21 23:33:30,753 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:30,753 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 1 (n.leader),  
>> 8589935532
>> (n.zxid), 3 (n.round), LEADING (n.state), 1 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:56,357 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:QuorumCnxManager@414] - Queue size: 0
>> 2011-10-21 23:33:56,357 - INFO
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@697] - Notification  
>> time out:
>> 51200
>> 2011-10-21 23:33:56,358 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 1
>> 2011-10-21 23:33:56,358 - DEBUG [WorkerSender  
>> Thread:QuorumCnxManager@389]
>> - There is a connection already for server 2
>> 2011-10-21 23:33:56,358 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:56,358 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 3 (n.leader),  
>> 12885265585
>> (n.zxid), 1 (n.round), LOOKING (n.state), 3 (n.sid), LOOKING (my  
>> state)
>> 2011-10-21 23:33:56,358 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@510] - id: 3, proposed  
>> id: 3,
>> zxid: 12885265585, proposed zxid: 12885265585
>> 2011-10-21 23:33:56,359 - DEBUG
>> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@727] - Adding vote:  
>> From = 3,
>> Proposed leader = 3, Porposed zxid = 12885265585, Proposed epoch = 1
>> 2011-10-21 23:33:56,360 - DEBUG [WorkerReceiver
>> Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new
>> notification message. My id = 3
>> 2011-10-21 23:33:56,360 - INFO  [WorkerReceiver
>> Thread:FastLeaderElection@496] - Notification: 1 (n.leader),  
>> 8589935532
>> (n.zxid), 3 (n.round), LEADING (n.state), 1 (n.sid), LOOKING (my  
>> state)
>> (END)
>>
>>
>>
>
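
The "Notification time out" values in the quoted log (3200, 6400, 12800, 25600, 51200) double on every round in which node 3 fails to settle on a leader. A minimal sketch of that backoff follows. The starting value of 1600 ms is inferred from the gap between the first two timestamps, and the 60000 ms cap is an assumption that does not appear in the log. The function names are hypothetical.

```python
def next_timeout(current_ms: int, cap_ms: int = 60000) -> int:
    """Double the wait, bounded by cap_ms (the cap value is an assumption)."""
    return min(current_ms * 2, cap_ms)


def timeout_sequence(start_ms: int, rounds: int, cap_ms: int = 60000) -> list:
    """Return the timeouts reported after each of `rounds` failed waits."""
    values = []
    current = start_ms
    for _ in range(rounds):
        current = next_timeout(current, cap_ms)
        values.append(current)
    return values


# Starting from an inferred 1600 ms, five rounds reproduce the logged values.
print(timeout_sequence(1600, 5))  # [3200, 6400, 12800, 25600, 51200]
```

If the cap assumption holds, the waits stop growing after the next round, so a node stuck in this state keeps retrying at a fixed interval instead of backing off forever.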

flavio
junqueira

research scientist

fpj@yahoo-inc.com
direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301

