zookeeper-user mailing list archives

From Jeremy Hanna <jeremy.hanna1...@gmail.com>
Subject Re: Failure scenarios and consequences
Date Thu, 09 Dec 2010 21:29:27 GMT
I created the page and linked to it from the main wiki:
http://wiki.apache.org/hadoop/ZooKeeper/FailureScenarios

Would someone please review it?  Specifically, I am curious to know about this:
"if the leader is in the non-quorum side of the partition, that side of the partition will
recognize that it no longer has a quorum of the ensemble. The leader will be demoted to being
a regular ZooKeeper server and those nodes will no longer accept reads or writes."
I just wanted to clarify: in the time it takes the non-quorum side to recognize that it no longer
has a quorum, will any writes ever get through?  Is it guaranteed that it won't accept
writes after the partition?  I don't think that guarantee can exist, but I wondered how to handle
that case.
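
For reference, the majority rule behind "quorum side" and "non-quorum side" can be sketched as below. This is only an illustration of the arithmetic, not ZooKeeper code; the function names and the 5-server, 3/2 split are assumptions made up for the example.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(reachable: int, ensemble_size: int) -> bool:
    """True if a group of `reachable` servers (counting itself) is a majority."""
    return reachable >= quorum_size(ensemble_size)


# Hypothetical 5-server ensemble split 3/2 by a network partition.
ensemble = 5
quorum_side, non_quorum_side = 3, 2

print(has_quorum(quorum_side, ensemble))      # the 3-server side can keep working
print(has_quorum(non_quorum_side, ensemble))  # the 2-server side cannot
```

Because a majority is strictly more than half, at most one side of any partition can satisfy `has_quorum`.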

On Dec 9, 2010, at 2:04 PM, Mahadev Konar wrote:

> Hi Jeremy,
>   Responses in line below:
> 
> On 12/9/10 11:53 AM, "Jeremy Hanna" <jeremy.hanna1234@gmail.com> wrote:
> 
> I looked around on the wiki and in the user list archives and couldn't find something
> definitive about certain failure scenarios.
> 
> A partition splits the ensemble where a quorum is on one side of the partition
> -- if the leader is on the quorum side of the partition, what happens to reads/writes
> that go to the non-quorum side?  I assume writes return errors because it can't get to the
> leader.  Reads?
> 
>> The reads will also fail on all the non-quorum nodes until a quorum is re-established.
> 
> -- if the leader is on the non-quorum side of the partition, I would assume that the
> quorum side of the partition would elect a new leader for those clients on its side of the
> partition.  However, is there the possibility for the leader on the non-quorum side to accept
> writes before it realizes that there's no longer a quorum?  Just wondering about the possibility
> of corruption and then when the cluster syncs back up how the cluster would handle that data.
> 
>> No, there isn't. The leader relinquishes its right as a leader as soon as it realizes
>> a quorum isn't committing the changes it proposed.
> 
> (I would be happy to create a wiki page for failure scenarios if one doesn't exist that
> people could add to, but maybe this is just common knowledge.)
> 
>> Please do!
> 
> thanks
> mahadev
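
To make the answer about the non-quorum leader concrete, here is a toy model of a leader that commits a write only once a majority of the ensemble has acknowledged it, and gives up leadership otherwise. It is a sketch under simplifying assumptions, not ZooKeeper's implementation: `ToyLeader` and its fields are hypothetical names, and loss of quorum is collapsed into a single `propose` call, whereas a real server would only notice it after some timeout.

```python
class ToyLeader:
    """Toy leader: a proposal commits only with acks from a majority."""

    def __init__(self, ensemble_size: int, reachable_followers: int):
        self.ensemble_size = ensemble_size
        self.reachable_followers = reachable_followers
        self.is_leader = True
        self.committed = []  # writes that reached a majority

    def propose(self, write: str) -> bool:
        if not self.is_leader:
            raise RuntimeError("not the leader")
        # The leader counts its own ack plus every follower it can still reach.
        acks = 1 + self.reachable_followers
        if acks >= self.ensemble_size // 2 + 1:
            self.committed.append(write)
            return True
        # No majority: the write is never committed and the leader steps down.
        self.is_leader = False
        return False


# Hypothetical 5-server ensemble; the old leader can reach only 1 follower.
stranded = ToyLeader(ensemble_size=5, reachable_followers=1)
print(stranded.propose("set /config v2"), stranded.committed, stranded.is_leader)
```

In this model the stranded leader may still receive a write request, but nothing is committed on its side, so there is no committed state to reconcile when the partition heals.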

