zookeeper-user mailing list archives

From Benjamin Reed <br...@yahoo-inc.com>
Subject RE: Dealing with session expired
Date Thu, 12 Feb 2009 21:11:55 GMT
idleness is not a problem. the client library sends heartbeats to keep the session alive. the
client library will also handle reconnects automatically if a server dies.

since session expiration really is a rare catastrophic event (or at least it should be),
it is probably easiest to deal with it by starting with a fresh instance if your session expires.
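
A minimal sketch of the "fresh instance" approach for code that shares one handle between several classes, as in Tom's setup. It is not part of the ZooKeeper API: SessionHolder and replaceExpired are hypothetical names, and the handle type is left generic so the example compiles without the ZooKeeper jar. In real use the factory would presumably wrap the ZooKeeper constructor, and the caller would invoke replaceExpired after catching KeeperException.SessionExpiredException or receiving a watcher event with state Expired.

```java
import java.util.function.Supplier;

/**
 * Hypothetical holder for a shared client handle (illustration only).
 * Classes ask the holder for the current handle instead of keeping their
 * own reference, so a replacement after session expiry reaches all of them.
 */
final class SessionHolder<T> {
    private final Supplier<T> factory; // assumed to create a brand-new handle
    private T current;

    SessionHolder(Supplier<T> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    /** Returns the handle that is currently considered live. */
    synchronized T get() {
        return current;
    }

    /**
     * Call with the handle on which expiry was observed. A new handle is
     * created only if that handle is still the current one, so two classes
     * reporting the same expiry do not create two replacements.
     */
    synchronized T replaceExpired(T expired) {
        if (current == expired) {
            current = factory.get();
        }
        return current;
    }
}
```

The holder only swaps the handle. Ephemeral nodes and watches from the expired session are gone, so each class still has to re-create whatever it needs on the new handle.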

From: Tom Nichols [tmnichols@gmail.com]
Sent: Thursday, February 12, 2009 11:53 AM
To: zookeeper-user@hadoop.apache.org
Subject: Re: Dealing with session expired

I'm using a timeout of 5000ms.  Now let me ask this:  Suppose all of
my clients are waiting on some external event -- not ZooKeeper -- so
they are all idle and are not touching ZK nodes, nor are they calling
exists, getChildren, etc etc.  Can that idleness cause session expiry?

I'm running a local quorum of 3 nodes.  That is, I have an Ant script
that kicks off 3 <java> tasks in parallel to run QuorumPeerMain,
each with its own config file.

Regarding handling of the failure, I suspect I will just have to
reinitialize by creating a new instance of my client(s) that
themselves will have a new ZK instance.  I'm using Spring to wire
everything together, which is why it's particularly difficult to
simply re-create a new ZK instance and pass it to the classes using it
(those classes have no knowledge of each other).  But I _can_ just
pull a freshly-created (prototype) instance from the Spring
application context, which is where a new ZK client will be wired in.

The only ramification there is I have to throw the KeeperException as
a fatal exception rather than letting that client try to re-elect.  Or
maybe add in some logic to say "if I can't re-elect, _then_ throw an
exception and consider it fatal."

Thanks guys.


On Thu, Feb 12, 2009 at 2:39 PM, Patrick Hunt <phunt@apache.org> wrote:
> Regardless of frequency Tom's code still has to handle this situation.
> I would suggest that the "two classes" Tom is referring to in his mail, the
> ones that use ZK client object, should either be able to "reinitialize" with
> a new zk session, or they themselves should be discarded and new instances
> created using the new session (not sure what makes more sense for his
> archi...)
> Regardless of whether we reuse the session object or create a new one I
> believe the code using the session needs to "reinitialize" in some way --
> there's been a dramatic break from the cluster.
> As I mentioned, you can decrease the likelihood of expiration by increasing
> the timeout - but the downside is that you are less sensitive to clients
> dying (because their ephemeral nodes don't get deleted till close/expire and
> if you are doing something like leader election among your clients it will
> take longer for the followers to be notified).
> Patrick
> Mahadev Konar wrote:
>> Hi Tom,
>>  The session expired event means that the server expired the client,
>> and that means the watches and ephemerals will go away for that node.
>> How are you running your zookeeper quorum? Session expiry should be a
>> really rare event. If you have a quorum of servers it should rarely
>> happen.
>> mahadev
>> On 2/12/09 11:17 AM, "Tom Nichols" <tmnichols@gmail.com> wrote:
>>> So if a session expires, my ephemeral nodes and watches have already
>>> disappeared?  I suppose creating a new ZK instance with the old
>>> session ID would not do me any good in that case.  Correct?
>>> Thanks.
>>> -Tom
>>> On Thu, Feb 12, 2009 at 2:12 PM, Mahadev Konar <mahadev@yahoo-inc.com>
>>> wrote:
>>>> Hi Tom,
>>>>  We prefer to discard the zookeeper instance if a session expires.
>>>> Maintaining a one to one relationship between a client handle and a
>>>> session
>>>> makes it much simpler for users to understand the existence and
>>>> disappearance of ephemeral nodes and watches created by a zookeeper
>>>> client.
>>>> thanks
>>>> mahadev
>>>> On 2/12/09 10:58 AM, "Tom Nichols" <tmnichols@gmail.com> wrote:
>>>>> I've come across the situation where a ZK instance will have an
>>>>> expired connection and therefore all operations fail.  Now AFAIK the
>>>>> only way to recover is to create  a new ZK instance with the old
>>>>> session ID, correct?
>>>>> Now, my problem is, the ZK instance may be shared -- not between
>>>>> threads -- but maybe two classes in the same thread synchronize on
>>>>> different nodes by using different watchers.  So it makes sense that
>>>>> one ZK client instance can handle this.  Except that even if I detect
>>>>> the session expiration by catching the KeeperException, if I want to
>>>>> "resume" the session, I have to create a new ZK instance and pass it
>>>>> to any classes who were previously sharing the same instance.  Does
>>>>> this make sense so far?
>>>>> Anyway, bottom line is, it would be nice if a ZK instance could itself
>>>>> recover a session rather than discarding that instance and creating a
>>>>> new one.
>>>>> Thoughts?
>>>>> Thanks in advance,
>>>>> -Tom
