zookeeper-user mailing list archives

From Jordan Zimmerman <jor...@jordanzimmerman.com>
Subject Re: locking/leader election and dealing with session loss
Date Thu, 16 Jul 2015 11:38:18 GMT
Are you really seeing 30s GC pauses in production? If so, then of course this could happen.
However, if your application can tolerate a 30s pause (which is hard to believe), then your
session timeout is too low. The point of the session timeout is to give enough coverage for
such pauses. So, if your app allows 30-second pauses, your session timeout would have to be
much longer than 30 seconds.

-JZ
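
The coverage argument can be checked with a small timing model. This is only a sketch, not ZooKeeper code: the names `simulate` and `Outcome` are made up for illustration, and it assumes the server expires the session exactly `session_timeout` seconds after the last successful heartbeat, and that a paused client notices nothing until it runs again.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    b_can_acquire_at: float  # earliest time another client can take the lock
    a_resumes_at: float      # time the paused lock holder starts running again
    two_leaders: bool        # True if A resumes after B could already hold the lock


def simulate(session_timeout: float, pause: float) -> Outcome:
    # Assumption: A's last heartbeat succeeds at t=0, then A freezes for `pause` seconds.
    # Assumption: the server expires A's session exactly `session_timeout` seconds later.
    b_can_acquire_at = session_timeout
    a_resumes_at = pause
    # While frozen, A cannot observe missed heartbeats, so when it resumes it still
    # believes it is leader until it next checks its connection state.
    return Outcome(b_can_acquire_at, a_resumes_at, a_resumes_at > b_can_acquire_at)


if __name__ == "__main__":
    print(simulate(session_timeout=10, pause=30))  # pause exceeds the timeout
    print(simulate(session_timeout=60, pause=30))  # timeout covers the pause
```

In this model the overlap appears only when the pause is longer than the session timeout, which is the case the two sides of the thread disagree about.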



On July 16, 2015 at 4:35:36 AM, Ivan Kelly (ivank@apache.org) wrote:

In case there's still doubt around this issue, I've written a demo app that demonstrates the
problem.

https://github.com/ivankelly/hanging-chad

-Ivan

On Wed, Jul 15, 2015 at 11:22 PM Alexander Shraer <shralex@gmail.com> wrote:
I disagree, ZooKeeper itself actually doesn't rely on timing for safety -
it won't get into an inconsistent state even if all timing assumptions fail
(except for the sync operation, which is then not guaranteed to return the
latest value, but that's a known issue that needs to be fixed).
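
The sequence-number (lock generation) idea discussed in the quoted messages below can be sketched as a store that refuses writes from an older lock holder, so safety does not depend on timing. This is a minimal illustration under assumptions: `FencedStore` and `StaleGenerationError` are hypothetical names, and how a client obtains its generation number (some value that increases with each lock acquisition) is left open.

```python
class StaleGenerationError(Exception):
    """Raised when a write carries an older lock generation than one already seen."""


class FencedStore:
    """Hypothetical in-memory store that checks a lock generation on every write."""

    def __init__(self):
        self.highest_generation = -1  # highest generation accepted so far
        self.data = {}

    def write(self, generation, key, value):
        # Reject writers whose lock has since been taken over by a newer holder.
        if generation < self.highest_generation:
            raise StaleGenerationError(
                f"generation {generation} < {self.highest_generation}"
            )
        self.highest_generation = generation
        self.data[key] = value


if __name__ == "__main__":
    store = FencedStore()
    store.write(1, "row", "from old leader")
    store.write(2, "row", "from new leader")
    try:
        store.write(1, "row", "late write from old leader")
    except StaleGenerationError as err:
        print("rejected:", err)
```

With such a check, a client that wrongly believes it still holds the lock can no longer overwrite data written under a newer generation.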




On Wed, Jul 15, 2015 at 2:13 PM, Jordan Zimmerman <
jordan@jordanzimmerman.com> wrote:

> This property may hold if you make a lot of timing/synchrony assumptions
>
> These assumptions and timing are intrinsic to using ZooKeeper. So, of
> course I’m making these assumptions.
>
> -Jordan
>
>
>
> On July 15, 2015 at 3:57:12 PM, Alexander Shraer (shralex@gmail.com)
> wrote:
>
> This property may hold if you make a lot of timing/synchrony assumptions
> -- agreeing on who holds the lock in an asynchronous distributed system
> with failures is impossible, this is the FLP impossibility.
>
> But even if it holds, this property is not very useful if the ZK client
> itself doesn't have the application data. So one has to consider whether it
> is possible that the application sees messages from two clients that both
> think they are the leader, in an order which contradicts the lock
> acquisition order.
>
> On Wed, Jul 15, 2015 at 1:26 PM, Jordan Zimmerman <
> jordan@jordanzimmerman.com> wrote:
>
>> I think we may be talking past each other here. My contention (and the
>> ZK docs agree BTW) is that, properly written and configured, "at any
>> snapshot in time no two clients think they hold the same lock". How your
>> application acts on that fact is another thing. You might need sequence
>> numbers, you might not.
>>
>> -Jordan
>>
>>
>> On July 15, 2015 at 3:15:16 PM, Alexander Shraer (shralex@gmail.com)
>> wrote:
>>
>> Jordan, as Camille suggested, please read Sec 2.4 in the Chubby paper:
>> http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf
>>
>> it suggests 2 ways in which the storage can support lock generations and
>> proposes an alternative for the case where the storage can't be made aware
>> of lock generations.
>>
>> On Wed, Jul 15, 2015 at 1:08 PM, Jordan Zimmerman <
>> jordan@jordanzimmerman.com> wrote:
>>
>> > Ivan, I just read the blog and I still don’t see how this can happen.
>> > Sorry if I’m being dense. I’d appreciate a discussion on this. In your
>> > blog you state: "when ZooKeeper tells you that you are leader, there’s no
>> > guarantee that there isn’t another node that 'thinks' its the leader."
>> > However, given a long enough session timeout (I usually recommend 30–60
>> > seconds), I don’t see how this can happen. The client itself determines
>> > that there is a network partition when there is no heartbeat success. The
>> > heartbeat interval is a fraction of the session timeout. Once the heartbeat
>> > fails, the client must assume it no longer has the lock. Another client
>> > cannot take over the lock until, at minimum, the session timeout has
>> > elapsed. So, how then can there be two leaders?
>> >
>> > -Jordan
>> >
>> > On July 15, 2015 at 2:23:12 PM, Ivan Kelly (ivank@apache.org) wrote:
>> >
>> > I blogged about this exact problem a couple of weeks ago [1]. I give an
>> > example of how split brain can happen in a resource under a zk lock (Hbase
>> > in this case). As Camille says, sequence numbers ftw. I'll add that the
>> > data store has to support them though, which not all do (in fact I've yet
>> > to see one in the wild that does). I've implemented a prototype that works
>> > with hbase[2] if you want to see what it looks like.
>> >
>> > -Ivan
>> >
>> > [1] https://medium.com/@ivankelly/reliable-table-writer-locks-for-hbase-731024295215
>> > [2] https://github.com/ivankelly/hbase-exclusive-writer
>> >
>> > On Wed, Jul 15, 2015 at 9:16 PM Vikas Mehta <vikasmehta@gmail.com> wrote:
>> >
>> > > Jordan, I mean the client gives up the lock and stops working on the shared
>> > > resource. So when zookeeper is unavailable, no one is working on any shared
>> > > resource (because they cannot distinguish network partition from zookeeper
>> > > DEAD scenario).
>> > >
>> > >
>> > >
>> > >
>> >
>>
>>
>
