hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Debugging help for SessionExpiredException
Date Wed, 09 Jun 2010 18:47:49 GMT
"100MB partition"? Sounds like virtualization. Resource starvation 
(worse in a virtualized env) is a common cause of this. Are your clients 
GCing/swapping at all? If a client GCs for a long period of time, the 
heartbeat thread won't be able to run and the server will expire the 
session. There is a min/max cap that the server places on the client 
timeout (it's negotiated); check the client log for details on what 
timeout it negotiated (logged in 3.3 releases).
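For reference, the server-side cap is derived from tickTime in zoo.cfg: by default the minimum session timeout is 2x tickTime and the maximum is 20x tickTime, and 3.3 adds explicit minSessionTimeout/maxSessionTimeout settings to override those bounds. A rough sketch (values are illustrative, not a recommendation):

```
# zoo.cfg (illustrative values)
tickTime=2000
# Defaults if the next two lines are unset:
#   min = 2 * tickTime = 4000 ms, max = 20 * tickTime = 40000 ms.
# With the defaults, a client asking for a 60s timeout would be
# negotiated down to 40s -- check the client log for the actual value.
minSessionTimeout=4000
maxSessionTimeout=60000
```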

take a look at this and see if you can make progress:
http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting

My guess is that your client is GCing for long periods of time - you can 
rule this in/out by turning on GC logging in your clients and then 
viewing the results after another such incident happens (try gchisto for 
a graphical view).
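Turning on GC logging is just a matter of adding HotSpot flags to the client's java command line, e.g. (the log path and main class are placeholders):

```
java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:/path/to/gc.log ... YourClientMain
```

After the next expiration, look in gc.log for any single pause (or back-to-back full GCs) approaching your negotiated session timeout.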

Patrick

On 06/09/2010 11:36 AM, Jordan Zimmerman wrote:
> We have a test system using Zookeeper. There is a single Zookeeper
> server node and 4 clients. There is very little activity in this
> system. After a day's testing we start to see SessionExpiredException
> on the client. Things I've tried:
>
> * Increasing the session timeout to 1 minute * Making sure all JVMs
> are running in a 100MB partition
>
> Any help debugging this problem would be appreciated. What kind of
> diagnostics can I add? Are there more config parameters that I
> should try?
>
> -JZ
