zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: downsides of disabling ZooKeeper client session timeouts?
Date Fri, 15 Jul 2011 17:45:08 GMT
These concepts seem like apples and oranges.

Automatic reconnect is intended to compensate for a server outage or a
network failure that separates the server from the client.  Session
expiration is a critical mechanism for detecting when the client has died
or been separated from the server.

As such, these mechanisms can be viewed as duals of each other.  Automatic
reconnection is client-centric; session expiration tells the server cluster
when a client has died and is thus server-centric.

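In general, setting the session timeout to a very large number makes
ephemeral nodes completely useless: the cluster effectively never decides
that a dead client's session has expired, so that client's ephemeral nodes
are never removed.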
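
To make the point about ephemeral nodes concrete, here is a minimal toy
model of server-side session tracking.  It is only a sketch: the class
ToySessionTracker, its methods, and the paths are made up for illustration
and are not how the ZooKeeper server is implemented.  Two clients crash at
the same moment; one has a 10 second timeout and the other an effectively
infinite one.

```python
class ToySessionTracker:
    """Toy model of server-side session tracking (hypothetical, not ZooKeeper)."""

    def __init__(self):
        self.sessions = {}    # session_id -> (timeout_s, last_heard_s)
        self.ephemerals = {}  # path -> owning session_id

    def open_session(self, session_id, timeout_s, now):
        self.sessions[session_id] = (timeout_s, now)

    def heartbeat(self, session_id, now):
        timeout_s, _ = self.sessions[session_id]
        self.sessions[session_id] = (timeout_s, now)

    def create_ephemeral(self, path, session_id):
        self.ephemerals[path] = session_id

    def tick(self, now):
        """Expire silent sessions and delete the ephemeral nodes they own."""
        expired = [sid for sid, (timeout_s, last) in self.sessions.items()
                   if now - last > timeout_s]
        for sid in expired:
            del self.sessions[sid]
        self.ephemerals = {path: sid for path, sid in self.ephemerals.items()
                           if sid in self.sessions}
        return expired


tracker = ToySessionTracker()
tracker.open_session("normal", timeout_s=10, now=0)
tracker.open_session("huge", timeout_s=10**9, now=0)
tracker.create_ephemeral("/workers/a", "normal")
tracker.create_ephemeral("/workers/b", "huge")

# Both clients crash at t=0 and never send another heartbeat.
expired = tracker.tick(now=3600)
print(expired)                     # ['normal']
print(sorted(tracker.ephemerals))  # ['/workers/b']
```

An hour later the node owned by the long-timeout session is still there, so
anyone watching /workers would still believe that worker is alive.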

Very long timeouts can also have other adverse impacts on the system
internals.  I don't think that session timeouts have this issue, but
some other timeouts require that internal buffer and log sizes be increased.

For testing your client, you need to

a) emulate a connection loss and verify that your client goes into some sort
of safe mode for the duration of the disconnect.

b) emulate a session expiration and verify that the rest of the cluster does
the right thing when the session expires but the client has not reconnected.

c) emulate a session expiration and verify that the client does the right
thing when notified of a session expiration.
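
Cases (a) and (c) above can be sketched as a small client-side state
machine that a test drives with synthetic events.  Everything here is an
assumption made for illustration: ToyClient, Event, and the counters are
hypothetical names.  The three events loosely correspond to the
SyncConnected, Disconnected, and Expired states that the ZooKeeper client
reports to a Watcher, but this code does not use the ZooKeeper API.

```python
from enum import Enum


class Event(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    EXPIRED = "expired"


class ToyClient:
    """Hypothetical client-side state handling; not a real ZooKeeper client."""

    def __init__(self):
        self.safe_mode = True     # no connection yet, so act conservatively
        self.registered = False   # True once ephemeral state exists in this session
        self.sessions_started = 1
        self.registrations = 0

    def handle(self, event):
        if event is Event.CONNECTED:
            if not self.registered:
                # New session: ephemeral nodes and watches must be recreated.
                self.registrations += 1
                self.registered = True
            self.safe_mode = False
        elif event is Event.DISCONNECTED:
            # The session may still be alive on the server, so keep state
            # but stop acting on anything that depends on the cluster.
            self.safe_mode = True
        elif event is Event.EXPIRED:
            # The server has discarded the session and its ephemeral nodes.
            self.safe_mode = True
            self.registered = False
            self.sessions_started += 1


client = ToyClient()
history = []
for event in [Event.CONNECTED, Event.DISCONNECTED, Event.CONNECTED,
              Event.EXPIRED, Event.CONNECTED]:
    client.handle(event)
    history.append((event.value, client.safe_mode, client.registrations))

for row in history:
    print(row)
```

The design point is that a plain disconnect followed by a reconnect does not
re-register anything, while an expiration forces a new session and a fresh
registration.  Case (b) needs a second observer of the cluster and is not
covered by this sketch.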

You also need to examine what happens if client transactions to the server
fail, and what happens if a failure is reported (by connection loss) but
the operation actually succeeded.  Your application cannot assume that lack
of a successful return result implies that the operation failed.
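
The "reported failure, actual success" case can be reproduced with a toy
server that applies a write and then loses the reply.  This is a sketch
under stated assumptions: FlakyStore, ConnectionLost, create_once, and the
owner token are all hypothetical, and a real client would also have to cope
with the verification read itself failing.

```python
class ConnectionLost(Exception):
    """Raised when the reply never arrives; the outcome is unknown."""


class NodeExists(Exception):
    """Raised when the path is already present."""


class FlakyStore:
    """Toy server that applies the first create but drops its reply."""

    def __init__(self):
        self.nodes = {}
        self.drop_next_reply = True

    def create(self, path, data):
        if path in self.nodes:
            raise NodeExists(path)
        self.nodes[path] = data           # the write is applied...
        if self.drop_next_reply:
            self.drop_next_reply = False
            raise ConnectionLost(path)    # ...but the client never hears back

    def get(self, path):
        return self.nodes.get(path)


def create_once(store, path, owner_token, max_attempts=3):
    """Create path, treating connection loss as 'unknown', not 'failed'."""
    for _ in range(max_attempts):
        try:
            store.create(path, owner_token)
            return "created"
        except ConnectionLost:
            # Check whether our own write actually landed before retrying.
            if store.get(path) == owner_token:
                return "confirmed-after-loss"
        except NodeExists:
            if store.get(path) == owner_token:
                return "confirmed-after-loss"
            return "owned-by-someone-else"
    raise RuntimeError("gave up after %d attempts" % max_attempts)


store = FlakyStore()
result = create_once(store, "/lock", "client-42")
print(result)       # confirmed-after-loss
print(store.nodes)  # {'/lock': 'client-42'}
```

Writing a unique token into the node is what lets the client tell its own
earlier write apart from one made by a competitor.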

On Fri, Jul 15, 2011 at 9:38 AM, willjohnsonsearch <
willjohnsonsearch@gmail.com> wrote:

> Can someone describe the difference between automatically reconnecting the
> client and setting the session timeout to an infinite number?  Ted mentioned
> "network partitions and other failures" and Scott gave some ways to simulate
> the disconnects but how would that manifest in the client?  (all possible
> cases, not just a few)  If i'm going to go to the trouble to handle these
> cases i want to make sure i have test cases to prove that what i'm doing is
> correct and more importantly, complete.
> --
> View this message in context:
> http://zookeeper-user.578899.n2.nabble.com/downsides-of-disabling-ZooKeeper-client-session-timeouts-tp6346179p6587455.html
> Sent from the zookeeper-user mailing list archive at Nabble.com.
