hadoop-zookeeper-user mailing list archives

From Vinod Johnson <Thomas.John...@Sun.COM>
Subject Re: Simpler ZooKeeper event interface....
Date Thu, 08 Jan 2009 00:06:11 GMT

>> I guess then I don't follow the leader election recipe. Is the 
>> following scenario possible in the leader election recipe:
>> 1) Leader L is partitioned from the ensemble.
>> 2) ZK servers expire its session.
>> 3) Some other follower F now becomes a leader.
>> 4) L and F form a split brain?
>>
>> I had wrongly assumed that the session was like a lease in that it 
>> allowed the client and server to independently know that the session 
>> had expired by the use of the global clock. Wouldn't it make sense 
>> for the client lib to expire its local session handle and never reuse 
>> it?
>
> Here's a good reason for each client to know its session status 
> (connected/disconnected/expired). Depending on the application, if L 
> does not have a connected session to the ensemble it may need to be 
> careful how it acts.
>
> I'm trying to think through some cases...
>
> In the case of passive leader the followers will look at zk and only 
> send requests to the leader, so this seems fine (L no longer gets 
> requests, it syncs to the ensemble at some point and finds its 
> session expired, it recovers as appropriate)
>
But depending on timing, couldn't the old leader still get a request 
from some follower who is lagging in terms of event receipt (or is 
disconnected - which brings up the question of dealing with 
disconnection at the follower)? Not sure how likely this is in practice 
... but I can't say I'm comfortable with all the theoretical 
possibilities at this point. In this case, a disconnected leader could 
play it safe and not accept new requests.
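
A minimal sketch of that "play it safe" behaviour, assuming the client
library reports session status changes (connected/disconnected/expired) to
the application. CautiousLeader, on_session_event and handle_request are
hypothetical names used for illustration only; they are not part of the
ZooKeeper API.

```python
from enum import Enum


class SessionState(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    EXPIRED = "expired"


class CautiousLeader:
    """Hypothetical leader that only serves requests while its session is connected."""

    def __init__(self):
        self.state = SessionState.CONNECTED
        self.handled = []

    def on_session_event(self, new_state):
        # Assumed to be called by the client library on every status change.
        self.state = new_state

    def handle_request(self, request):
        # Play it safe: refuse work unless the session is known to be connected.
        if self.state is not SessionState.CONNECTED:
            return False
        self.handled.append(request)
        return True


leader = CautiousLeader()
accepted_before = leader.handle_request("r1")       # connected: accepted
leader.on_session_event(SessionState.DISCONNECTED)
accepted_during = leader.handle_request("r2")       # disconnected: refused
leader.on_session_event(SessionState.CONNECTED)
accepted_after = leader.handle_request("r3")        # resync'd in time: accepted
```

The refusal only takes effect once the leader has noticed the disconnect, so
a window remains between the partition and its detection.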
> In the case of an active leader, L continues to send commands 
> (whatever) to the followers. However a new leader L' has since been 
> elected and is also sending commands to the followers. In this case it 
> seems like either a) L should not send commands if it's not sync'd to 
> the ensemble (and holds the leader token) or b) followers should not 
> accept commands from a non-leader (only accept from the current leader). 
> a) seems the right way to go; if L is disconnected it should stop 
> sending commands to the followers, if it's resync'd in time it can
Seems to make sense in this particular case (I had some other cases in 
mind that I'm not so sure about though)
> start sending commands again; otherwise its session will expire, a new 
> leader L' will be elected and will start sending commands to followers, 
> eventually L will resync and notice that it is no longer the leader 
> (and do whatever it takes to recover).
>
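
Option b) could be sketched as follows, assuming each follower learns the
identity of the current leader from ZooKeeper and is told when it changes.
Follower, on_leader_change and receive are hypothetical names, not part of
any real API.

```python
class Follower:
    """Hypothetical follower that only applies commands from the leader it currently knows."""

    def __init__(self, current_leader):
        self.current_leader = current_leader
        self.applied = []

    def on_leader_change(self, new_leader):
        # Assumed to be driven by a watch on the election state in ZooKeeper.
        self.current_leader = new_leader

    def receive(self, sender, command):
        # Reject anything that does not come from the leader we believe in.
        if sender != self.current_leader:
            return False
        self.applied.append((sender, command))
        return True


follower = Follower("L")
from_old_before = follower.receive("L", "c1")    # L is still the known leader
follower.on_leader_change("L'")
from_old_after = follower.receive("L", "c2")     # stale leader: rejected
from_new = follower.receive("L'", "c3")          # new leader: accepted
```

A follower that has not yet seen the leader change (lagging or disconnected)
would still accept "c2" from L, which is the timing concern raised above.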
> > Wouldn't it make sense for the
> > client lib to expire its local session handle and never reuse it?
>
> I would think that depends on how expensive it is to change leaders. 
> It would be trivial for the client to close its session and start a 
> new one each time it's notified of a disconnect from the ensemble.
>
Perhaps that's good enough. An alternative would be to wait for the 
timeout period.
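
A rough sketch of the "wait for the timeout period" alternative, assuming
the application knows the session timeout and can read a local clock.
LocalSessionGuard and its methods are hypothetical names. The handle is
given up locally once it has been disconnected for the full timeout and is
never reused afterwards.

```python
class LocalSessionGuard:
    """Hypothetical wrapper that expires a session handle locally after a timeout."""

    def __init__(self, timeout_s, clock):
        self.timeout_s = timeout_s
        self.clock = clock            # injectable so the sketch can be tested
        self.disconnected_at = None
        self.dead = False

    def _check(self):
        # Mark the handle dead once it has been disconnected for the full timeout.
        if (self.disconnected_at is not None
                and self.clock() - self.disconnected_at >= self.timeout_s):
            self.dead = True

    def on_disconnect(self):
        if self.disconnected_at is None:
            self.disconnected_at = self.clock()

    def on_reconnect(self):
        self._check()
        if not self.dead:
            self.disconnected_at = None

    def state(self):
        self._check()
        if self.dead:
            return "expired"
        return "connected" if self.disconnected_at is None else "disconnected"


now = [0.0]
guard = LocalSessionGuard(timeout_s=10.0, clock=lambda: now[0])
states = [guard.state()]
guard.on_disconnect()
now[0] = 4.0
states.append(guard.state())
guard.on_reconnect()              # back within the timeout: handle stays usable
states.append(guard.state())
guard.on_disconnect()
now[0] = 15.0
states.append(guard.state())      # disconnected for 11s: expired locally
guard.on_reconnect()
states.append(guard.state())      # an expired handle is never reused
```

The local timer starts when the client notices the disconnect, which may be
later than the last heartbeat the server saw, so this does not by itself
guarantee that the client gives up before the server expires the session.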
> Patrick
>