zookeeper-user mailing list archives

From Camille Fournier <cami...@apache.org>
Subject Re: locking/leader election and dealing with session loss
Date Wed, 15 Jul 2015 18:33:16 GMT
To expand on my point: if you want to be able to keep making progress while
ZK is down, acquiring the lock should also give the lock owner a sequence
number that identifies the period of operation it is in. Say you get
sequence number 1 and tag all of your requests with 1. If for any reason
you lose the lock without knowing it and server #2 gets the lock, it should
get sequence number 2. The resource should then reject all requests with a
sequence number below 2, so any remaining requests tagged 1 that are still
lying around get rejected by the resource.
And there you have it: you can continue to make safe forward progress while
in an uncertain state on the ZK side, so long as the original lock holder is
available and the resource validates these things. If both ZK itself and the
original lock holder go down, you're still stuck, presumably.
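
A minimal sketch of the resource-side check described above, written in
Python for brevity. The names FencedResource, StaleTokenError and write are
hypothetical and not part of any ZooKeeper API. How a lock holder obtains its
sequence number (for example, from the suffix of a sequential lock znode) is
assumed and left out. The sketch also assumes the resource only learns about
sequence number 2 when the first request tagged 2 arrives.

```python
class StaleTokenError(Exception):
    """Raised when a request carries a sequence number older than one already seen."""


class FencedResource:
    """Hypothetical resource that validates the sequence number on every request."""

    def __init__(self):
        self.highest_seen = 0   # highest sequence number accepted so far
        self.log = []           # accepted (sequence, payload) pairs

    def write(self, sequence, payload):
        # Reject anything tagged with a sequence number below the highest
        # one already accepted: it comes from a holder that lost the lock.
        if sequence < self.highest_seen:
            raise StaleTokenError(
                f"sequence {sequence} is below {self.highest_seen}"
            )
        self.highest_seen = sequence
        self.log.append((sequence, payload))


resource = FencedResource()
resource.write(1, "from A")        # A holds the lock with sequence 1
resource.write(2, "from B")        # B acquired the lock with sequence 2

try:
    resource.write(1, "late from A")   # A's leftover request arrives late
    rejected = False
except StaleTokenError:
    rejected = True
```

The comparison is a strict "below" so that the current holder can keep
sending requests with its own sequence number; only older holders are shut
out.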

C


On Wed, Jul 15, 2015 at 2:24 PM, Camille Fournier <camille@apache.org>
wrote:

> I thought that the client itself had a notion of the session timeout
> internally that would conservatively let the client know that it was dead?
> If not, then that's my faulty memory.
>
> That being said, if you really care about the client not sending messages
> when it does not have the lock, the resource under contention needs to
> validate the messages it is receiving. You cannot guarantee that, just
> because a client believes it is connected and sends a message to the
> locked resource, the message will be received while the sender still
> has the lock. If you don't care about this possibility, then just assuming
> you have lost the lock whenever you are in any state other than connected
> is adequate, but be aware that events such as long GC pauses and network
> issues can cause you to access the resource improperly.
>
> C
>
> On Wed, Jul 15, 2015 at 2:19 PM, Jordan Zimmerman <
> jordan@jordanzimmerman.com> wrote:
>
>> Once client A loses connection it must assume that it no longer has the
>> lock (you could try to time the session but I think that’s a bad idea).
>> Once you reconnect, you will know if your session is still active or not.
>> When done correctly, there’s no chance that both A and B will think they
>> own the lock at the same time.
>>
>> -Jordan
>>
>>
>>
>> On July 15, 2015 at 1:17:10 PM, Vikas Mehta (vikasmehta@gmail.com) wrote:
>>
>> Thanks for the quick response Camille. If client A owns the lock, gets
>> disconnected due to network partition, it will not see the SESSION_EXPIRED
>> event until it is too late, i.e. client B has acquired the lock and done
>> the damage. Problem here is that client cannot distinguish network partition
>> from zookeeper ensemble in leader election state.
>>
>>
>>
>> --
>> View this message in context:
>> http://zookeeper-user.578899.n2.nabble.com/locking-leader-election-and-dealing-with-session-loss-tp7581277p7581279.html
>> Sent from the zookeeper-user mailing list archive at Nabble.com.
>>
>
>
