zookeeper-user mailing list archives

From Deepak Jagtap <deepak.jag...@maxta.com>
Subject Re: New zookeeper server fails to join quorum with msg "Have smaller server identifie"
Date Wed, 05 Mar 2014 19:50:06 GMT
Hi,

I have applied only the 1805 patch, not 1810.
The upgrade is from 3.5.0.1458648 to 3.5.0.1562289 (not from 3.4.5).
The rolling upgrade was failing very consistently in our environment, and
after applying the 1805 patch it went smoothly.

Regards,
Deepak





On Wed, Mar 5, 2014 at 7:36 AM, German Blanco <
german.blanco.blanco@gmail.com> wrote:

> Hello,
>
> do you mean ZOOKEEPER-1810 patch?
> That one alone doesn't solve the problem. On the other hand, the problem
> doesn't happen always, so after a rolling start it might get solved.
> We need 1818 as well, but it is easier to go step by step and get 1810 in
> trunk first.
> I hope that as soon as 3.4.6 is out this might get some attention.
>
> Regards,
>
> German.
>
>
> On Wed, Mar 5, 2014 at 2:17 AM, Deepak Jagtap <deepak.jagtap@maxta.com> wrote:
>
> > Hi,
> >
> > Please ignore my previous comment: I used the wrong jar file, and hence the
> > rolling upgrade failed.
> > After applying patch for bug  on zookeeper-3.5.0.1562289
> > revision, rolling upgrade went fine.
> >
> > I have patched our in-house ZooKeeper version, but it would be more convenient
> > if the patch were applied to trunk so that we could use the latest trunk.
> > Please advise if I can apply the patch on the trunk and test it for you.
> >
> > Thanks & Regards,
> > Deepak
> >
> >
> > On Tue, Mar 4, 2014 at 12:09 PM, Deepak Jagtap <deepak.jagtap@maxta.com> wrote:
> >
> > > Hi German,
> > >
> > > I tried applying the patch for 1805, but the problem still persists.
> > > The following notification messages are logged repeatedly by the node
> > > that fails to join the quorum:
> > >
> > >
> > > 2014-03-04 20:00:54,398 [myid:2] - INFO
> > >  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@837] -
> > > Notification time out: 51200
> > > 2014-03-04 20:00:54,400 [myid:2] - INFO
> > >  [WorkerReceiver[myid=2]:FastLeaderElection@605] - Notification: 2
> > > (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0
> > > (n.peerEPoch), LOOKING (my state)1 (n.config version)
> > > 2014-03-04 20:00:54,401 [myid:2] - INFO
> > >  [WorkerReceiver[myid=2]:FastLeaderElection@605] - Notification: 3
> > > (n.leader), 0x100003e84 (n.zxid), 0x2 (n.round), FOLLOWING (n.state), 1
> > > (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)1 (n.config version)
> > > 2014-03-04 20:00:54,403 [myid:2] - INFO
> > >  [WorkerReceiver[myid=2]:FastLeaderElection@605] - Notification: 3
> > > (n.leader), 0x100003e84 (n.zxid), 0xffffffffffffffff (n.round), LEADING
> > > (n.state), 3 (n.sid), 0x2 (n.peerEPoch), LOOKING (my state)1 (n.config
> > > version)
> > >
> > >
> > >
> > > Patch for 1732 is already included in the trunk.
> > >
> > >
> > > Thanks & Regards,
> > > Deepak
> > >
> > >
> > > On Fri, Feb 28, 2014 at 2:58 PM, Deepak Jagtap <deepak.jagtap@maxta.com> wrote:
> > >
> > >> Hi Flavio, German,
> > >>
> > >> Since this fix is critical for ZooKeeper rolling upgrades, is it OK if I
> > >> apply this patch to 3.5.0 trunk?
> > >> Is it straightforward to apply this patch to trunk?
> > >>
> > >> Thanks & Regards,
> > >> Deepak
> > >>
> > >>
> > >> On Wed, Feb 26, 2014 at 11:46 AM, Deepak Jagtap <deepak.jagtap@maxta.com> wrote:
> > >>
> > >>> Thanks German!
> > >>> Just wondering is there any chance that this patch may be applied to
> > >>> trunk in near future?
> > >>> If it's fine with you guys, I would be more than happy to apply the
> > >>> fixes (from 3.4.5) to trunk and test them.
> > >>>
> > >>> Thanks & Regards,
> > >>> Deepak
> > >>>
> > >>>
> > >>> On Wed, Feb 26, 2014 at 1:29 AM, German Blanco <
> > >>> german.blanco.blanco@gmail.com> wrote:
> > >>>
> > >>>> Hello Deepak,
> > >>>>
> > >>>> due to ZOOKEEPER-1732 and then ZOOKEEPER-1805, there are some cases in
> > >>>> which an ensemble can be formed so that it doesn't allow any other
> > >>>> zookeeper server to join.
> > >>>> This has been fixed in branch 3.4, but it hasn't been fixed in trunk
> > >>>> yet.
> > >>>> Check if the Notifications sent around contain different values for the
> > >>>> vote in the members of the ensemble.
> > >>>> If you force a new election (e.g. by killing the leader) I guess
> > >>>> everything should work normally, but don't take my word for it.
> > >>>> Flavio should know more about this.
> > >>>>
> > >>>> Cheers,
> > >>>>
> > >>>> German.
> > >>>>
> > >>>>
> > >>>> On Wed, Feb 26, 2014 at 4:04 AM, Deepak Jagtap <deepak.jagtap@maxta.com> wrote:
> > >>>>
> > >>>> > Hi,
> > >>>> >
> > >>>> > I am replacing one of the ZooKeeper servers in a 3-node quorum.
> > >>>> > Initially, all ZooKeeper servers were running version 3.5.0.1515976.
> > >>>> > I successfully replaced Node3 with the newer version 3.5.0.1551730.
> > >>>> > When I try to replace Node2 with the same ZooKeeper version,
> > >>>> > I can't start the ZooKeeper server on Node2, as it is continuously
> > >>>> > stuck in a leader election loop, printing the following messages:
> > >>>> >
> > >>>> > 2014-02-26 02:45:23,709 [myid:3] - INFO
> > >>>> >  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@837] -
> > >>>> > Notification time out: 60000
> > >>>> > 2014-02-26 02:45:23,710 [myid:3] - INFO
> > >>>> >  [WorkerSender[myid=3]:QuorumCnxManager@195] - Have smaller server
> > >>>> > identifier, so dropping the connection: (5, 3)
> > >>>> > 2014-02-26 02:45:23,712 [myid:3] - INFO
> > >>>> >  [WorkerReceiver[myid=3]:FastLeaderElection@605] - Notification: 3
> > >>>> > (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x0
> > >>>> > (n.peerEPoch), LOOKING (my state)1 (n.config version)
> > >>>> >
> > >>>> >
> > >>>> > Network connections and configuration of the node being upgraded are
> > >>>> > fine.
> > >>>> > The other 2 nodes in the quorum are fine and serving requests.
> > >>>> >
> > >>>> > Any idea what might be causing this?
> > >>>> >
> > >>>> > Thanks & Regards,
> > >>>> > Deepak
> > >>>> >
> > >>>>
> > >>>
> > >>>
> > >>
> > >
> >
>
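
The check suggested earlier in this thread (whether the Notifications from
members of the ensemble carry different values for the vote) can be sketched
as a small log parser. This is only a minimal sketch and not part of
ZooKeeper: the names ensemble_votes and votes_differ are hypothetical, the
regular expression is fitted only to the Notification lines quoted in this
thread, and treating (n.leader, n.zxid, n.peerEPoch) from senders that are
not LOOKING as "the vote" is an assumption.

```python
import re
from collections import defaultdict

# Pattern fitted to the FastLeaderElection "Notification:" lines quoted in
# this thread. Other ZooKeeper versions may log a different layout.
NOTIFICATION = re.compile(
    r"Notification:\s*(?P<leader>-?\d+)\s*\(n\.leader\),\s*"
    r"(?P<zxid>0x[0-9a-fA-F]+)\s*\(n\.zxid\),\s*"
    r"(?P<round>0x[0-9a-fA-F]+)\s*\(n\.round\),\s*"
    r"(?P<state>\w+)\s*\(n\.state\),\s*"
    r"(?P<sid>\d+)\s*\(n\.sid\),\s*"
    r"(?P<epoch>0x[0-9a-fA-F]+)\s*\(n\.peerEPoch\)"
)


def ensemble_votes(log_text):
    """Map each sender (n.sid) that is not LOOKING to the votes it reported.

    A vote is taken to be (n.leader, n.zxid, n.peerEPoch); that choice of
    fields is an assumption made for this sketch.
    """
    # Log entries wrap across lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    votes = defaultdict(set)
    for m in NOTIFICATION.finditer(flat):
        if m.group("state") == "LOOKING":
            continue  # only servers already in the ensemble are of interest
        vote = (
            int(m.group("leader")),
            int(m.group("zxid"), 16),
            int(m.group("epoch"), 16),
        )
        votes[int(m.group("sid"))].add(vote)
    return dict(votes)


def votes_differ(votes):
    """Return True if the ensemble members reported more than one distinct vote."""
    distinct = set().union(*votes.values())
    return len(distinct) > 1
```

Fed the three Notification lines from 2014-03-04 quoted above (with the ">"
quote markers removed), servers 1 and 3 report different n.peerEPoch values
(0x1 and 0x2), so votes_differ returns True.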
