hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Debugging help for SessionExpiredException
Date Tue, 15 Jun 2010 23:15:10 GMT
I don't have much personal experience running ZK on EC2 small
instances; Ted usually has the EC2-related insight. Given that these
boxes are idle or only lightly loaded, and you've ruled out GC and swap,
the only thing I can think of is that something at the VM level is
causing the high latency you're seeing.

You're seeing a max latency of 15 _minutes_. I can't think of anything
inside ZK that would cause that. Is there any chance the VM is shutting
down or "freezing" during that period? I don't know. Are you monitoring
that system from a second system? That might shed some light: monitor
CPU/disk activity with a tool like Ganglia or Nagios, or, more
primitively, ping the system, track the round-trip time and packet
loss, dump the results to a file, and review them the next day.
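
As a rough sketch of the "ping and dump to a file" idea, something like
the following could run overnight on the second system. It is only an
illustration: it times a TCP connect rather than sending an ICMP ping
(which would need raw-socket privileges), and the function names
(probe, monitor), the CSV layout, and the "LOST" marker are all made up
for this example.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port, timeout=5.0):
    """Return the TCP connect time in seconds, or None on failure/timeout."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None


def monitor(host, port, path, interval=10.0, samples=None):
    """Append one 'timestamp,rtt_ms' line per probe; 'LOST' marks a failure.

    samples=None runs until interrupted (hypothetical overnight mode).
    """
    count = 0
    while samples is None or count < samples:
        rtt = probe(host, port)
        stamp = datetime.now(timezone.utc).isoformat()
        value = "LOST" if rtt is None else f"{rtt * 1000:.1f}"
        with open(path, "a") as log:
            log.write(f"{stamp},{value}\n")
        count += 1
        if samples is None or count < samples:
            time.sleep(interval)
```

For example, monitor("zk-host.example", 2181, "zk-latency.csv") would
probe every 10 seconds until interrupted; the host name and file name
are placeholders, and 2181 is assumed to be the client port in use. A
run of LOST lines or a spike in the second column the next morning would
point at the VM or the network rather than at ZK itself.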

Patrick

On 06/15/2010 03:59 PM, Jordan Zimmerman wrote:
> They're small instances. The thing is that these machines are doing
> next to no work. We're just running simple little tests. The session
> expiration has not happened while I've been watching. It tends to
> happen overnight.
>
> -JZ
>
> On Jun 15, 2010, at 1:50 PM, Ted Dunning wrote:
>
>> As usual, the ZK team provides the best feedback.
>>
>> I would be bold enough to ask what kind of ec2 instances you are
>> running on.  Small instances are small chunks of larger machines
>> and are sometimes subject to competition for resources from the
>> other tenants.
>>
>> On Tue, Jun 15, 2010 at 12:30 PM, Patrick Hunt <phunt@apache.org>
>> wrote:
>> 3) under-provisioned virtual machines (ie vmware)
>>
>> ...
>>
>> Given that you've ruled out the gc (most common), disk utilization
>> would be the next thing to check.
>>
>
>
