zookeeper-user mailing list archives

From Vinayak Khot <vina...@nutanix.com>
Subject Re: Possible issue with cluster availability following new Leader Election - ZK 3.4
Date Wed, 16 May 2012 18:55:20 GMT
We have also encountered a problem where the newly elected leader sends
an entire snapshot to a follower even though the follower is in sync with
the leader.

A closer look at the code shows a problem in the logic that decides
whether to send a snapshot.
The following scenario explains the problem in detail.
Start a 3-node ZooKeeper ensemble where every quorum member has seen the
same changes, so the last zxid on every member is the same.
zxid: 0x400000004
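
As background for the values used in this scenario: a zxid is a 64-bit
number with the epoch in the high 32 bits and a per-epoch counter in the
low 32 bits. The sketch below re-implements that packing for illustration
only. The class ZxidSketch and its method names are hypothetical stand-ins
and are not ZooKeeper's own ZxidUtils. It shows that 0x400000004 is epoch 4,
counter 4, and that the first zxid of the next epoch is 0x500000000.

```java
// Illustrative sketch only: ZxidSketch is a hypothetical stand-in,
// not the ZxidUtils class shipped with ZooKeeper.
final class ZxidSketch {

    // Pack an epoch (high 32 bits) and a counter (low 32 bits) into one zxid.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    // Extract the epoch from the high 32 bits.
    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    // Extract the counter from the low 32 bits.
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    public static void main(String[] args) {
        long lastSeen = 0x400000004L;
        // A leader elected into the next epoch starts its counter at 0.
        long firstOfNextEpoch = makeZxid(epochOf(lastSeen) + 1, 0);
        System.out.printf("epoch=%d counter=%d next=0x%x%n",
                epochOf(lastSeen), counterOf(lastSeen), firstOfNextEpoch);
    }
}
```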

1. When a newly elected leader starts, it bumps its zxid up to the first
zxid of the new epoch.

Code snippet from Leader.java:

long epoch = getEpochToPropose(self.getId(), self.getAcceptedEpoch());
zk.setZxid(ZxidUtils.makeZxid(epoch, 0));
synchronized(this){
     lastProposed = zk.getZxid();  // 0x500000000
}

2. Now a follower tries to join the leader with its
peerLastZxid = 0x400000004.

Note that the leader now has an in-memory committedLog list with
maxCommittedLog = 0x400000004.

As committedLog doesn't contain any transactions with zxid > peerLastZxid,
we check whether the leader and follower are in sync.

Code snippet from LearnerHandler.java:

leaderLastZxid = leader.startForwarding(this, updates);
// Here: 0x400000004 == 0x500000000, which is false
if (peerLastZxid == leaderLastZxid) {
   // We are in sync so we'll do an empty diff
   packetToSend = Leader.DIFF;
   zxidToSend = leaderLastZxid;
}

Note that leader.startForwarding() returns the lastProposed zxid, which
the leader has already set to 0x500000000.
So in this scenario we never send an empty diff even though the leader and
follower are in sync, and we end up sending the entire snapshot in the code
that follows the check above.
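
To make the failure concrete, here is a minimal model of the equality
check with the values from this scenario. It is a simplification, not the
LearnerHandler code: the class SyncCheckSketch, the Packet enum and
choosePacket() are hypothetical names, and the model collapses every
non-equal case to a snapshot.

```java
// Simplified model of the check described above. All names here are
// hypothetical; this is not the actual LearnerHandler implementation.
final class SyncCheckSketch {

    enum Packet { DIFF, SNAP }

    // An empty DIFF is chosen only when both zxids match exactly;
    // in this simplified model everything else falls through to SNAP.
    static Packet choosePacket(long peerLastZxid, long leaderLastZxid) {
        return peerLastZxid == leaderLastZxid ? Packet.DIFF : Packet.SNAP;
    }

    public static void main(String[] args) {
        long peerLastZxid = 0x400000004L;  // follower's last zxid
        long lastProposed = 0x500000000L;  // leader's zxid after the epoch bump

        // The follower has every transaction, yet the zxids differ.
        System.out.println(choosePacket(peerLastZxid, lastProposed)); // prints SNAP
    }
}
```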

A possible fix would be to keep a lastProcessedZxid in the leader that
gets updated only when the leader processes a transaction. While syncing
with a follower, if the peerLastZxid sent by the follower is the same as
the leader's lastProcessedZxid, we can send an empty diff to the follower.
This would avoid unnecessarily sending an entire snapshot when the leader
and follower are already in sync.
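
A rough sketch of this idea, under the assumption that the leader tracks
lastProcessedZxid separately from lastProposed. The class
LastProcessedSketch and all of its members are hypothetical and only model
the comparison; a real patch would also have to keep the existing
committedLog handling, which is omitted here.

```java
// Sketch of the proposed fix. All names are hypothetical; this models
// only the in-sync comparison, not the rest of the sync protocol.
final class LastProcessedSketch {

    enum Packet { DIFF, SNAP }

    private long lastProposed;       // bumped when a new epoch starts
    private long lastProcessedZxid;  // changes only when a transaction is processed

    LastProcessedSketch(long lastProcessedZxid) {
        this.lastProcessedZxid = lastProcessedZxid;
        this.lastProposed = lastProcessedZxid;
    }

    // Starting a new epoch moves lastProposed but not lastProcessedZxid.
    void startNewEpoch(long epoch) {
        lastProposed = epoch << 32;
    }

    // Processing a transaction advances both values.
    void processTransaction(long zxid) {
        lastProposed = zxid;
        lastProcessedZxid = zxid;
    }

    long getLastProposed() {
        return lastProposed;
    }

    // Compare against lastProcessedZxid instead of lastProposed.
    Packet choosePacket(long peerLastZxid) {
        return peerLastZxid == lastProcessedZxid ? Packet.DIFF : Packet.SNAP;
    }

    public static void main(String[] args) {
        LastProcessedSketch leader = new LastProcessedSketch(0x400000004L);
        leader.startNewEpoch(5); // lastProposed becomes 0x500000000
        System.out.println(leader.choosePacket(0x400000004L)); // prints DIFF
    }
}
```

With the values from the scenario, the follower at 0x400000004 now gets an
empty diff even though lastProposed has already moved to 0x500000000.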

ZooKeeper developers, please share your views on the above-mentioned issue.

- Vinayak

On Mon, May 14, 2012 at 8:30 AM, Camille Fournier <camille@apache.org> wrote:

> Thanks.
> I just ran a couple of tests to start the debugging. Mark, I don't see
> a long cluster settle with a mostly empty data set, so I think this
> might be two different problems. I do see a lot of snapshots being
> sent though so there is probably some overaggressiveness in the way
> that we evaluate when to send snapshots that should be evaluated.
> Adding the dev mailing list, as I may need ben or flavio to take a
> look as well.
>
> C
>
> On Thu, May 10, 2012 at 10:48 AM,  <Alexandar.Gvozdenovic@ubs.com> wrote:
> > Cheers - Raised https://issues.apache.org/jira/browse/ZOOKEEPER-1465
> >
> >
> >
> > -----Original Message-----
> > From: Camille Fournier [mailto:camille@apache.org]
> > Sent: 10 May 2012 14:58
> > To: user@zookeeper.apache.org
> > Subject: Re: Possible issue with cluster availability following new
> > Leader Election - ZK 3.4
> >
> > I will take a look at this soon, have you created a Jira for it? If not
> > please do so.
> >
> > Thanks,
> > C
> >
> > On Thu, May 10, 2012 at 7:20 AM,  <Alexandar.Gvozdenovic@ubs.com> wrote:
> >> I think there may be a problem here with the 3.4 branch. I dropped the
> >> cluster back to 3.3.5 and the behaviour was much better.
> >>
> >> To summarize:
> >>
> >> 650mb of data
> >> 20k nodes of varied size
> >> 3 node cluster
> >>
> >> On 3.4.x (using latest branch build)
> >> ---------
> >> Takes 3-4 minutes to bring up a cluster from cold
> >> Takes 40-50 secs to recover from a leader failure
> >> Takes 10 secs for a new follower to join the cluster
> >>
> >> On 3.3.5
> >> --------
> >> Takes 10-20 secs to bring up a cluster from cold
> >> Takes 10 secs to recover from a leader failure
> >> Takes 10 secs for a new follower to join the cluster
> >>
> >> Any views on this from the ZK devs? The differences in behaviour only
> >> start becoming apparent as the dataset gets bigger.
> >> I was hoping to use 3.4 for the transactional features it offered via
> >> the 'multi-update' operations, but this issue seems pretty serious...
> >>
> >>
> >>
> >> Visit our website at http://www.ubs.com
> >>
> >> This message contains confidential information and is intended only
> >> for the individual named. If you are not the named addressee you
> >> should not disseminate, distribute or copy this e-mail. Please notify
> >> the sender immediately by e-mail if you have received this e-mail by
> >> mistake and delete this e-mail from your system.
> >>
> >> E-mails are not encrypted and cannot be guaranteed to be secure or
> >> error-free as information could be intercepted, corrupted, lost,
> >> destroyed, arrive late or incomplete, or contain viruses. The sender
> >> therefore does not accept liability for any errors or omissions in the
> >> contents of this message which arise as a result of e-mail transmission.
> >> If verification is required please request a hard-copy version. This
> >> message is provided for informational purposes and should not be
> >> construed as a solicitation or offer to buy or sell any securities or
> >> related financial instruments.
> >>
> >> UBS Limited is a company limited by shares incorporated in the United
> >> Kingdom registered in England and Wales with number 2035362.
> >> Registered office: 1 Finsbury Avenue, London EC2M 2PP.  UBS Limited is
> >> authorised and regulated by the Financial Services Authority.
> >>
> >> UBS AG is a public company incorporated with limited liability in
> >> Switzerland domiciled in the Canton of Basel-City and the Canton of
> >> Zurich respectively registered at the Commercial Registry offices in
> >> those Cantons with Identification No: CH-270.3.004.646-4 and having
> >> respective head offices at Aeschenvorstadt 1, 4051 Basel and
> >> Bahnhofstrasse 45, 8001 Zurich, Switzerland.  Registered in the United
> >> Kingdom as a foreign company with No: FC021146 and having a UK
> >> Establishment registered at Companies House, Cardiff, with No:
> >> BR 004507.  The principal office of UK Establishment: 1 Finsbury
> >> Avenue, London EC2M 2PP.  In the United Kingdom, UBS AG is authorised
> >> and regulated by the Financial Services Authority.
> >>
> >> UBS reserves the right to retain all messages. Messages are protected
> >> and accessed only in legally justified cases.
>
