curator-user mailing list archives

From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: LOST ConnectionState
Date Tue, 08 Oct 2013 21:13:30 GMT
Thanks Jordan,
I think I've just misinterpreted how the LOST event was implemented.

My thinking was that application code can be slightly more optimistic under
certain circumstances than the documentation suggests. The Curator docs
indicate that the client should suspend all locks etc. once it sees a
SUSPENDED event and completely restart them if it sees a LOST event. My
thinking was that for cases like a standard mutex lock that can only be
held by a single client at a time, the client can assume that it still holds
the lock after it sees a SUSPENDED event until 2/3 of the session timeout has
elapsed without a reconnect. 2/3 because the ZooKeeper client pings
every 1/3 of the session timeout, so in the worst case the connection will
have dropped just before a ping was due.
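
To make that 2/3 reasoning concrete, here is a minimal sketch in plain Java.
It is only an illustration of the idea above, not something Curator provides:
the class SuspendedGraceWindow and its methods are hypothetical names, the
timestamps are passed in by the caller, and it assumes the client really did
hold the lock when the connection was suspended. It also ignores clock drift
and any delay in noticing the disconnect, so a real implementation would want
extra leeway.

```java
// Hypothetical helper, NOT part of Curator or ZooKeeper. It models the
// "2/3 of the session timeout" window using caller-supplied timestamps.
final class SuspendedGraceWindow {
    private final long sessionTimeoutMs;
    private long suspendedAtMs = -1; // -1 means "not currently suspended"

    SuspendedGraceWindow(long sessionTimeoutMs) {
        if (sessionTimeoutMs <= 0) {
            throw new IllegalArgumentException("sessionTimeoutMs must be positive");
        }
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    // Call when the connection state becomes SUSPENDED.
    // Only the first SUSPENDED in a row starts the clock.
    void onSuspended(long nowMs) {
        if (suspendedAtMs < 0) {
            suspendedAtMs = nowMs;
        }
    }

    // Call when the connection is re-established.
    void onReconnected() {
        suspendedAtMs = -1;
    }

    // Assumed grace period: 2/3 of the session timeout (integer division).
    long graceMs() {
        return sessionTimeoutMs * 2 / 3;
    }

    // True while connected, or while suspended for less than the grace period.
    boolean mayAssumeLockHeld(long nowMs) {
        if (suspendedAtMs < 0) {
            return true;
        }
        return nowMs - suspendedAtMs < graceMs();
    }
}
```

In practice onSuspended and onReconnected would be driven from a connection
state listener, and mayAssumeLockHeld would be checked before each piece of
work done under the lock.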

Anyway, thanks for the clarification.
cheers
Cam


On Wed, Oct 9, 2013 at 6:09 AM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:

>> Just wondering how the LOST connection state is determined? I would have
>> thought that it would be safe to be in a SUSPENDED connection state until
>> somewhere close to the session timeout was reached. From my experimentation
>> though it seems that the LOST state isn't related to either the session
>> timeout or the connection timeout.
>>
> When Curator gets a disconnection event, it sets the state to SUSPENDED
> and executes a sync() in the background (using retries, etc.). If that
> sync() fails, it sets the state to LOST. However, if Curator sees an
> expired session event, it goes straight to LOST.
>
>> From my experimentation though it seems that the LOST state isn't related
>> to either the session timeout or the connection timeout.
>>
> There is a relationship, but it isn't 1-to-1.
>
>> If I have a 5 second session timeout configured for the Curator
>> connection, it takes (in my case) 9 seconds between the SUSPENDED state and
>> the LOST state. Given that the session is expired on the server side well
>> before the LOST state is received, this seems incorrect.
>
> The timeouts are not necessarily related to connection state. I don't
> think this is implied in the docs. If it is, the docs should be updated.
>
> -JZ
>
>
> On Oct 7, 2013, at 6:37 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
> wrote:
>
> Looking further into this, I think that it could be considered a bug.
>
> If I have a 5 second session timeout configured for the Curator
> connection, it takes (in my case) 9 seconds between the SUSPENDED state and
> the LOST state. Given that the session is expired on the server side well
> before the LOST state is received, this seems incorrect.
>
> Any thoughts?
>
>
> On Wed, Oct 2, 2013 at 4:17 PM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
>
>> Hi,
>> Just wondering how the LOST connection state is determined? I would have
>> thought that it would be safe to be in a SUSPENDED connection state until
>> somewhere close to the session timeout was reached. From my experimentation
>> though it seems that the LOST state isn't related to either the session
>> timeout or the connection timeout.
>>
>> Is there some rationale behind this?
>>
>> My thinking was that for leader election, locks etc. that rely on
>> ephemeral nodes, we can be sure that these nodes are going to exist for as
>> long as the session timeout, and thus we can be disconnected from ZooKeeper
>> for up to the session timeout (with a bit of leeway for safety) and still
>> assume that our ephemeral nodes are present. For leader election or locks
>> where it is not possible for another client to come and 'steal' this
>> function from the client isn't this a safe assumption?
>>
>> Or am I missing something?
>> cheers
>> Cam
>>
>
>
>
