zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Thomas Vinod Johnson <Thomas.John...@Sun.COM>
Subject Re: What happens when a server loses all its state?
Date Wed, 17 Dec 2008 16:37:12 GMT
Mahadev Konar wrote:
> Hi Thomas,
>  Here is what would happen in the scenario you mentioned.
>> Great - thanks Mahadev.
>> Not to drag this on more than necessary, please bear with me for one
>> more example of 'amnesia' that comes to mind. I have a set of ZooKeeper
>> servers A, B, C.
>> - C is currently not running, A is the leader, B is the follower.
>> - A proposes zxid1 to A and B, both acknowledge.
>> - A asks A to commit (which it persists), but before the same commit
>> request reaches B, all servers go down (say a power failure).
> In this case, the zookeeper protocol says that zxid1 would be available only
> if the client gets a success. So zxid1 may or may not get committed if A and
> B come up later. ( this is a different scenario then what you mention
> later).
The general scenario I was interested in was a minority of servers 
losing state, and trying to understand what other correlated events 
could cause issues. Just to be clear, since A has sent the commit to B 
(or is it when A has got its own commit), it *could have* sent a success 
back to the client before everything went down, correct?
>> - Later, B and C come up (A is slow to reboot), but B has lost all state
>> due to disk failure.
> This is how zookeeper would work in this scenario ---
> Now since we have B and C come up and B has the most recent state but loses
> it, then zookeeper is clueless about this. So C would say I have the some
> zxid say zxid-n and B would say that I have zxid = 0 (since its stateless)
> and C would become a leader (since it has the highest zxid).
> This would lead to loss of data and loss of state in zookeeper. That's what
> I meant when I mentioned that zookeeper relies heavily on the state being
> persisted on disk.
OK good, my understanding was correct then.
>> - C becomes the new leader and perhaps continues with some more new
>> transactions.
> Now if A comes back again, C would say that its the leader and ask A to
> truncate all the transactions that A had to come to sync with C.
I wasn't aware that C would ask A to truncate even committed 
transactions (the zookeeper internals doc/slides talks about proposals - 
I suspect I may have some terminology confusion here). Another 
possibility is C is now at zxid2 >= zxid1, in which case A could 
possibly *not* get rid of the committed transaction?
> Again, you can see that how persistence loss can trigger state loss in
> zookeeper. If its just minority of servers failing then this can be taken
> care of by zookeeper but in this scenario is C failing and then being
> brought up with an inconsisten state with another failure of A and data loss
> of B -- which zookeeper cannot handle.
> I hope this helps. 
Yes thanks. Not sure if this makes sense, but is it worthwhile to have a 
'safe' mode when a server comes up with no state (I think it should be 
simple to distinguish between having a clean disk 'no state'/corrupt 
state and 'empty state')? In this case, I think it could simply wait 
till it sees a successful propose/commit cycle to know that it is safe 
for it to take a snapshot and start participating in the ensemble.

In the scenario I previously described, when B and C comes up, B would 
not respond to C, but just watch - C would not be able to establish 
quorum until A came up; at which point B has witnessed a successful 
leader activation, and can join. If one is willing to sacrifice liveness 
for safety in situations where 1 or more nodes have amnesia, would this 
be a viable option?

View raw message