zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Debugging help for SessionExpiredException
Date Wed, 09 Jun 2010 23:21:21 GMT

On 06/09/2010 03:35 PM, Lei Zhang wrote:

> We've consistently run into issues with vmware workstation (CentOS as guest
> OS) on Windows host: just by leaving the cluster idle over night leads to zk
> session expire issue. My theory is: windows may have gone to hibernation,
> the zk heartbeat logic hibernates, session expire exception is thrown the
> moment windows is taken out of hibernation.

That sounds like a possible scenario.
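One way to check that theory is to leave a small stall detector running next to the client overnight. This is only a rough sketch, not anything that ships with ZooKeeper: the names `sample` and `find_stalls` and the interval/tolerance values are made up for illustration. It records wall-clock timestamps at a fixed interval and reports any gap much larger than that interval, which is what a hibernated host (or a badly starved guest) should look like from the inside.

```python
import time


def sample(count, interval):
    """Collect `count` wall-clock timestamps, sleeping `interval` seconds
    between them. Wall-clock time is used on purpose so that time spent
    suspended shows up as a gap once the guest clock catches up."""
    stamps = []
    for _ in range(count):
        stamps.append(time.time())
        time.sleep(interval)
    return stamps


def find_stalls(timestamps, interval, tolerance):
    """Return (start_time, gap_seconds) for every pair of consecutive
    samples that are more than interval + tolerance seconds apart."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > interval + tolerance:
            stalls.append((prev, gap))
    return stalls
```

If `find_stalls(sample(count, 1.0), 1.0, 2.0)` reports a gap longer than your session timeout around the time of the expiration, the process (or the whole VM) was not running and could not have sent heartbeats. Keep in mind that a clock adjustment can also look like a stall, so treat a hit as a hint rather than proof.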

> On EC2 (still CentOS as guest OS), we consistently run into zk session
> expire issue when our cluster is under heavy load. I am planning to raise
> scheduling priority of zk server, but haven't done testing.

Before you take any action you might examine a few things to identify 
what's biting you:

This has some good general detail on issues other users have seen:

In particular, you might look at GC/swapping on your clients; that's the
most common cause we see for session expiration (apart from the obvious
one, network-level connectivity failures). In one case I remember, there
was very heavy network load for a period of time once per day. This was
causing some issue on the switches that would result in occasional
session expiration, but only during this short window. This was pretty
hard to track down. Are you monitoring network connectivity in general?
Is it possible that temporary network outages are causing this? Perhaps
take a look at both your server and client ZK logs and see if the client
is seeing anything other than the session expiration. Is the client
seeing session TIMED OUT, for example? That happens when the client
doesn't hear back from the server, while session expiration happens
because the server doesn't hear from the client.
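To get a quick picture of which of the two you are hitting, you could count the relevant lines in the client log. The sketch below is an assumption-laden example, not a ZooKeeper tool: it matches only on the substrings "timed out" and "expired", the function name `classify` is hypothetical, and the exact log wording depends on your ZooKeeper version, so adjust the patterns to what your logs actually contain.

```python
import re
from collections import Counter

# Assumed substrings; change them to match the wording in your own logs.
PATTERNS = {
    "timed_out": re.compile(r"timed out", re.IGNORECASE),
    "expired": re.compile(r"expired", re.IGNORECASE),
}


def classify(lines):
    """Count log lines that mention a session timeout (client did not hear
    from the server) versus a session expiration (server did not hear from
    the client)."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

For example, `classify(open("client.log"))` (with a hypothetical file name) returns the two counts. Many timeouts leading up to each expiration would point toward the network or an unresponsive server, while expirations with no preceding timeouts would point more toward the client itself being paused.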

Good luck,

