zookeeper-user mailing list archives

From Alexander Shraer <shra...@gmail.com>
Subject Re: Data loss scenario
Date Wed, 20 Aug 2014 21:56:37 GMT
I think it's:

src/java/main/org/apache/zookeeper/server/quorum/Leader.java:
waitForEpochAck throws an exception if the follower is ahead of the leader
in terms of data, as in your example.

src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java:
run() throws an exception if the follower has a more up-to-date
configuration than the leader.
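
To make the first check concrete, here is a minimal sketch of the idea. It
is not the code in Leader.java: the class, the method names, and the
epoch/zxid values are assumptions for illustration. It assumes a server's
state is summarized by its current epoch and last logged zxid, and that a
follower is "ahead" when its summary is more recent than the leader's.

```java
import java.io.IOException;

public class FollowerAheadCheckSketch {
    /** Hypothetical summary of a server's state: its epoch and last logged zxid. */
    static final class Summary {
        final long currentEpoch;
        final long lastZxid;

        Summary(long currentEpoch, long lastZxid) {
            this.currentEpoch = currentEpoch;
            this.lastZxid = lastZxid;
        }

        /** Compare by epoch first, then by last zxid. */
        boolean isMoreRecentThan(Summary other) {
            return currentEpoch > other.currentEpoch
                || (currentEpoch == other.currentEpoch && lastZxid > other.lastZxid);
        }
    }

    /** Would-be leader rejects a follower whose data is ahead of its own. */
    static void checkFollower(Summary leader, Summary follower) throws IOException {
        if (follower.isMoreRecentThan(leader)) {
            throw new IOException("Follower is ahead of the leader");
        }
    }

    public static void main(String[] args) {
        Summary c = new Summary(1, 2); // would-be leader C: logged d1, d2 (assumed zxids)
        Summary a = new Summary(1, 3); // follower A: logged d1, d2, d3
        try {
            checkFollower(c, a);
            System.out.println("accepted");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In the scenario from the original question, C plays the leader role and A
the follower role, so the check fails and C abandons its attempt.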

Since a leader needs support from a quorum, at least one of the servers
that knows about d3 will have to connect to C when C tries to become leader
(d3 was committed, and any two majorities intersect). So C will not be able
to gather the required support without triggering the checks above.
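
The intersection argument can be checked by brute force for the
three-server ensemble. The sketch below is illustrative only (the class and
method names are made up, and it assumes simple majority quorums): it
enumerates every majority of {A, B, C}, confirms that any two majorities
share a server, and confirms that every majority containing C also contains
A or B, the servers that hold d3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class QuorumIntersectionSketch {
    /** All subsets containing strictly more than half of the servers. */
    static List<Set<String>> majorities(List<String> servers) {
        List<Set<String>> result = new ArrayList<>();
        int n = servers.size();
        for (int mask = 0; mask < (1 << n); mask++) {
            if (Integer.bitCount(mask) * 2 > n) {
                Set<String> quorum = new HashSet<>();
                for (int i = 0; i < n; i++) {
                    if ((mask & (1 << i)) != 0) {
                        quorum.add(servers.get(i));
                    }
                }
                result.add(quorum);
            }
        }
        return result;
    }

    /** True if every pair of quorums has at least one server in common. */
    static boolean everyPairIntersects(List<Set<String>> quorums) {
        for (Set<String> a : quorums) {
            for (Set<String> b : quorums) {
                Set<String> common = new HashSet<>(a);
                common.retainAll(b);
                if (common.isEmpty()) {
                    return false;
                }
            }
        }
        return true;
    }

    /** True if every quorum that includes the candidate also includes a holder. */
    static boolean everyQuorumTouches(List<Set<String>> quorums, String candidate,
                                      Set<String> holders) {
        for (Set<String> q : quorums) {
            if (!q.contains(candidate)) {
                continue;
            }
            Set<String> common = new HashSet<>(q);
            common.retainAll(holders);
            if (common.isEmpty()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Set<String>> quorums = majorities(Arrays.asList("A", "B", "C"));
        Set<String> holdersOfD3 = new HashSet<>(Arrays.asList("A", "B"));
        System.out.println("majorities: " + quorums.size());
        System.out.println("pairwise intersect: " + everyPairIntersects(quorums));
        System.out.println("C always meets a d3 holder: "
            + everyQuorumTouches(quorums, "C", holdersOfD3));
    }
}
```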

In fact, C is very unlikely to get as far as trying to become the leader.
As Henry mentioned, ZooKeeper runs a preliminary protocol, implemented in
FastLeaderElection.java, which tries to make sure that the candidate leader
has the most up-to-date data and support from a quorum. This is how the
candidate is chosen; the other servers then establish connections to this
candidate. The checks above cover the case where, by the time connections
are established, some server that the candidate did not hear from during
FastLeaderElection connects, and the candidate discovers that it should not
really be the leader. It then gives up and returns to FastLeaderElection.
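
As a rough illustration of how an election can prefer the most up-to-date
server, the sketch below orders votes by epoch, then by last zxid, then by
server id. This is a simplified assumption about the comparison, not a copy
of FastLeaderElection.java; the class name, the method name, the server ids
(A=1, C=3), and the epoch/zxid values are all hypothetical.

```java
public class VoteOrderSketch {
    /**
     * True if the new vote should replace the current one:
     * higher epoch wins, then higher zxid, then higher server id.
     */
    static boolean supersedes(long newEpoch, long newZxid, long newId,
                              long curEpoch, long curZxid, long curId) {
        return newEpoch > curEpoch
            || (newEpoch == curEpoch
                && (newZxid > curZxid
                    || (newZxid == curZxid && newId > curId)));
    }

    public static void main(String[] args) {
        // Assumed values: same epoch; A has logged up to zxid 3 (d3), C only up to 2.
        long epoch = 1;
        long zxidA = 3, idA = 1;
        long zxidC = 2, idC = 3;

        // C starts by voting for itself, then hears A's vote.
        System.out.println("A's vote replaces C's: "
            + supersedes(epoch, zxidA, idA, epoch, zxidC, idC));
        // A starts by voting for itself, then hears C's vote.
        System.out.println("C's vote replaces A's: "
            + supersedes(epoch, zxidC, idC, epoch, zxidA, idA));
    }
}
```

Under this ordering C's larger server id only matters when the zxids tie, so
a server that is missing d3 does not win against one that has it.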





On Wed, Aug 20, 2014 at 10:42 AM, Gaurav Saxena <gsaxena81@gmail.com> wrote:

> Thanks! That's great... If someone can point me to the code where this is
> decided, it will be a great help... as I have to present evidence that this
> scenario will not happen
>
>
> On Wed, Aug 20, 2014 at 10:33 AM, Henry Robinson <henry@cloudera.com>
> wrote:
>
> > IIRC, C cannot become the master because it does not have all the changes
> > that A and B have seen. The leader election protocol can take care of
> > ensuring the invariant that the elected master must be the most up-to-date
> > of all peers. (Alternatively, the new master can request the missing log
> > suffix from the peers during election, but I believe, although it's a while
> > since I checked, that ZK does the former. Someone can fill in the details /
> > correct me).
> >
> > Henry
> >
> >
> > On 20 August 2014 10:24, Gaurav Saxena <gsaxena81@gmail.com> wrote:
> >
> > > I am curious about an apparent data loss scenario. I describe it below.
> > >
> > > There are three zookeeper servers A, B, and C.
> > > 1. At one point in time t1 the state of the system is as follows:
> > > A is up and contains data d1, d2. A is master
> > > B is up and contains data d1, d2
> > > C is up and contains data d1, d2
> > >
> > > 2. At time t2 C goes down. The state of the system at t2 is
> > > A is up and contains data d1, d2. A is master
> > > B is up and contains data d1, d2
> > > C is down and its log contains data d1, d2
> > >
> > > 3. At time t3 the state of the system changes
> > > A is up and contains data d1, d2, d3. A is master
> > > B is up and contains data d1, d2, d3
> > > C is down and its log contains data d1, d2
> > >
> > > 4. At time t4, C comes up and also becomes the master, while A and B are
> > > also up
> > >
> > > Question: Because C is master, will the logs of A and B be truncated to
> > > contain only d1 and d2? Is this considered a data loss scenario? If yes, is
> > > there an issue around it?
> > >
> > > --
> > > Regards
> > > Gaurav Saxena
> > >
> >
> >
> >
> > --
> > Henry Robinson
> > Software Engineer
> > Cloudera
> > 415-994-6679
> >
>
>
>
> --
> Regards
> Gaurav Saxena
>
