curator-user mailing list archives

From Jeremy Stribling <st...@vmware.com>
Subject Re: adding a "network timeout" to curator?
Date Thu, 27 Feb 2014 05:45:39 GMT
Please correct me if I'm wrong, but I thought Curator goes into 
SUSPENDED mode when it gets a Disconnected state event from its ZK 
client.  That is not necessarily the same as a network issue, because 
a ZK keepalive could be stuck in the ZK server processing queue, 
blocked on a slow disk.  What I'm proposing would be a true, 
network-only timeout that could be used to declare a client disconnected 
quickly if there's a network issue, without having to reduce the ZK 
session timeout so low that a slow disk would cause false negatives.  
Does that make sense?

Jeremy

On 02/26/2014 09:25 PM, Jordan Zimmerman wrote:
> Curator should already go into SUSPENDED when there is a connection 
> issue, right? How would this be different?
>
> -JZ
>
> ------------------------------------------------------------------------
> From: Jeremy Stribling <mailto:strib@vmware.com>
> Reply: user@curator.apache.org <mailto:user@curator.apache.org>
> Date: February 26, 2014 at 7:56:26 AM
> To: user@curator.apache.org <mailto:user@curator.apache.org>
> Subject: adding a "network timeout" to curator?
>> Hi all,
>>
>> I started a thread on the ZK list a while back about timeouts in ZK.
>> You can find it in the archives here:
>>
>> http://mail-archives.apache.org/mod_mbox/zookeeper-user/201309.mbox/%3C522F7A9D.20800@nicira.com%3E
>>
>> The basic idea is that when ZK is running on a node with slow disks
>> (e.g., in a VM), you might want to set your session timeout to a long
>> value (e.g., 30 seconds or 60 seconds), but still detect network
>> timeouts quickly. On that thread, Michi proposed using 'ruok' commands
>> from the client to test network connectivity, along with the normal
>> client pings happening in the background to detect server slowness.
>>
>> I was wondering if this would make sense to provide as part of the
>> Curator Framework or Client. There could be some background thread
>> sending 'ruok' commands to whatever server the client is connected to,
>> and going into SUSPENDED (or LOST?) mode when it hits a timeout or gets
>> a failure back. We might be able to implement something like that here
>> and contribute it back, if it sounds interesting to other people and we
>> can agree on a design. Any thoughts?
>>
>> Jeremy
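
The 'ruok' probe proposed above could be sketched roughly as follows. This is a minimal illustration, not part of Curator or the ZooKeeper client: the class name RuokProbe and the method probe are hypothetical, and the sketch assumes the server answers the 'ruok' four-letter command with 'imok' on its client port. Newer ZooKeeper releases restrict four-letter commands through a whitelist setting, so 'ruok' may need to be enabled on the server.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Curator or ZooKeeper API): a single 'ruok' probe.
public final class RuokProbe {
    private RuokProbe() {}

    // Returns true only if the server replies "imok". The timeout applies to the
    // connect and to each read, so any failure or stall is reported as false.
    public static boolean probe(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs); // bounds each blocking read

            OutputStream out = socket.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Read up to 4 bytes; stop early if the server closes the connection.
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[4];
            int read = 0;
            while (read < buf.length) {
                int n = in.read(buf, read, buf.length - read);
                if (n < 0) {
                    break;
                }
                read += n;
            }
            return read == buf.length
                    && "imok".equals(new String(buf, StandardCharsets.US_ASCII));
        } catch (IOException e) {
            // Covers connection refused, unreachable host, and SocketTimeoutException.
            return false;
        }
    }
}
```

A background thread could call RuokProbe.probe(host, port, timeoutMs) on a fixed schedule against the currently connected server and treat a false result as the trigger for SUSPENDED (or LOST). How that would hook into Curator's connection state handling is left open here, as it is in the proposal.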

