zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: ZK Client won't time out when quorum irrevocably goes away
Date Fri, 04 Feb 2011 00:41:21 GMT
Hi Ted, I just sent a followup before seeing this which says similar:

>>I supposed that the client could close its session if it sees that
>>the disconnect happened long enough ago (the session timeout + some
>>safety factor). But this really is a special case (and 338 should be
>>implemented to address).

Sounds reasonable to me. Enter a jira.
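
The idea quoted above (treat the session as gone once the disconnect has lasted longer than the session timeout plus a safety margin) could be sketched roughly as follows. This is only an illustration in Python, not ZooKeeper client code: `DisconnectWatchdog`, its method names, and the `safety_margin_s` parameter are hypothetical, and the application would have to call the two hooks from its own watcher.

```python
import time


class DisconnectWatchdog:
    """Hypothetical helper: tracks how long a client has been disconnected."""

    def __init__(self, session_timeout_s, safety_margin_s, clock=time.monotonic):
        self.session_timeout_s = session_timeout_s
        self.safety_margin_s = safety_margin_s  # assumed extra slack, in seconds
        self._clock = clock
        self._disconnected_at = None  # None while connected

    def on_disconnected(self):
        # Keep the earliest disconnect time if notified more than once.
        if self._disconnected_at is None:
            self._disconnected_at = self._clock()

    def on_connected(self):
        # A successful reconnect resets the watchdog.
        self._disconnected_at = None

    def presumed_expired(self):
        # True once the disconnect has outlasted timeout + safety margin.
        if self._disconnected_at is None:
            return False
        elapsed = self._clock() - self._disconnected_at
        return elapsed > self.session_timeout_s + self.safety_margin_s
```

With a 3 minute timeout, a client that has been disconnected for 3 days would report `presumed_expired()` as true and could then close its session locally instead of retrying forever. A monotonic clock is used so that wall-clock adjustments do not distort the elapsed time.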


On Thu, Feb 3, 2011 at 4:08 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> Patrick,
> Is it really impossible for the client to say that soooo much time has
> passed in disconnected state that the session MUST have expired by now?
> I have heard this assertion before and it always irked me a bit, but Ryan's
> scenario is a great thought experiment (well, thought experiment for US, not
> for him).  Why can't those clients decide the session is expired after 3
> days when the timeout is 3 minutes?
> On Thu, Feb 3, 2011 at 4:01 PM, Patrick Hunt <phunt@apache.org> wrote:
>> On Thu, Feb 3, 2011 at 2:57 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
>> > The result was the client never realized that its session was
>> > actually timed out, and the HBase processes continued to run. Kill -9
>> > and a restart fixed it.
>> Hi Ryan,
>> there are two issues at play here, session timeout and session
>> expiration. Correct me if I'm wrong but I think you meant to say "the
>> client never realized that its session was actually _expired_". Which
>> is correct behavior. Clients can only determine if a session is
>> expired once they reconnect to the cluster. Session timeout on the
>> other hand happens when the server heartbeat is not received by the
>> client w/in the session timeout period. Clients who are disconnected
>> from the cluster will attempt to reconnect back to the cluster until
>> they are successful. When a client is disconnected the client's
>> watchers will be notified about the disconnect. (same for expiration).
>> See questions 1 & 2 here in the faq, specifically "Example state
>> transitions" in question 2:
>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/FAQ
>> Your clients were stuck btw steps 4 and 5 (which they will never reach
>> in your scenario).
>> Does that help?
>> Patrick
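
The behavior described in the quoted message (a disconnected client keeps retrying, and learns about expiration only after it reaches the cluster again) can be modeled with a toy state machine. This is a sketch under stated assumptions: `State`, `ToyClient`, and the two boolean arguments to `try_reconnect` are made up for illustration and do not correspond to the real client API.

```python
from enum import Enum


class State(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    EXPIRED = "expired"


class ToyClient:
    """Toy model of the described behavior; not the ZooKeeper client."""

    def __init__(self):
        self.state = State.CONNECTED
        self.events = []  # state changes that watchers would be told about

    def _notify(self, state):
        self.state = state
        self.events.append(state)

    def lose_connection(self):
        self._notify(State.DISCONNECTED)

    def try_reconnect(self, cluster_reachable, session_still_valid):
        if self.state is not State.DISCONNECTED:
            return
        if not cluster_reachable:
            return  # nothing learned; stay disconnected and retry later
        # Only a reachable cluster can say whether the session survived.
        self._notify(State.CONNECTED if session_still_valid else State.EXPIRED)


client = ToyClient()
client.lose_connection()
for _ in range(1000):
    # The quorum never comes back, so the client never sees an expiration.
    client.try_reconnect(cluster_reachable=False, session_still_valid=False)
```

After the loop the toy client is still in `State.DISCONNECTED`, which mirrors the stuck clients in the scenario: without a reachable cluster there is nobody to deliver the expiration.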
