curator-user mailing list archives

From Jordan Zimmerman <jor...@jordanzimmerman.com>
Subject Re: ConnectionState.LOST without retry.
Date Fri, 25 Mar 2016 16:51:57 GMT
I’d consider that a bug then. Please open an issue in Jira.

-Jordan

> On Mar 25, 2016, at 11:50 AM, Purshotam Shah <purushah@yahoo-inc.com> wrote:
> 
> But in the case of a long GC pause, it doesn't.
> This is what we have figured out from our tests. If ZK is down, Curator does retry based on the retry policy. But in the case of a long GC pause, it doesn't. If the GC pause > session timeout, then Curator notifies connection lost without retrying.
> 
> I was thinking it would be better if Curator also retried in the GC pause case.
> 
> Thanks,
> 
> 
> 
> 
> On Friday, March 25, 2016 9:44 AM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
> 
> 
> Curator does retry when the connection is lost, based on the retry policy. ConnectionState.LOST implies that the retry policy gave up.
> 
> -Jordan
> 
>> On Mar 25, 2016, at 11:33 AM, Purshotam Shah <purushah@yahoo-inc.com> wrote:
>> 
>> Thanks for the information. Wouldn't it make sense to retry once Curator receives connection lost from the ZK client? We have seen it do so when ZK is down: Curator retries per the retry policy before notifying connection lost.
>> 
>> Thanks,
>> 
>> 
>> 
>> On Thursday, March 24, 2016 1:52 PM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
>> 
>> 
>> The ZooKeeper client (which Curator uses) sends heartbeats to the connected server. The heartbeat is sent every 2/3 of the session timeout. If the heartbeat fails, the connection drops. Please read Tech Note 10 for details: https://cwiki.apache.org/confluence/display/CURATOR/TN10
>> 
>> -Jordan
>> 
>>> On Mar 24, 2016, at 12:30 PM, Purshotam Shah <purushah@yahoo-inc.com> wrote:
>>> 
>>> 
>>> We use Apache Curator to connect to ZK.
>>> We create the Curator client with the following settings.
>>> 1. session timeout = 5 min
>>> 2. connection timeout = 3 min
>>> 3. Retry = ExponentialBackoffRetry(1000, 10)
>>> 
>>> We have also set up a ConnectionStateListener. We use Curator mostly for distributed locking. We shut down the system when the connection is lost.
>>> 
>>> We noticed that if there is a long GC pause, we get notified with ConnectionState.LOST, and this is causing our system to go down.
>>> 
>>> We are working to figure out why there is a long GC pause.
>>> My question: even if a long GC pause exceeds the session timeout, doesn't Curator use the RetryPolicy to retry before notifying ConnectionState.LOST?
>>> 
>>> Thanks,
>>> 
>> 
>> 
>> 
> 
> 
> 
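
To make the timing discussed in this thread concrete, here is a minimal Java sketch of the arithmetic only. It takes the 2/3 heartbeat figure and the 5 minute session timeout from the messages above at face value. The class and method names (SessionTiming, heartbeatInterval, pauseExceedsSession) are hypothetical, and nothing here calls Curator or ZooKeeper.

```java
import java.time.Duration;

public class SessionTiming {

    // Heartbeat interval as stated in the thread: 2/3 of the session timeout.
    // This mirrors the message, not a verified ZooKeeper client constant.
    static Duration heartbeatInterval(Duration sessionTimeout) {
        return sessionTimeout.multipliedBy(2).dividedBy(3);
    }

    // True when a stop-the-world pause lasts longer than the whole session timeout,
    // which is the situation the original poster reports as leading to LOST.
    static boolean pauseExceedsSession(Duration pause, Duration sessionTimeout) {
        return pause.compareTo(sessionTimeout) > 0;
    }

    public static void main(String[] args) {
        Duration session = Duration.ofMinutes(5);   // setting from the original message
        Duration gcPause = Duration.ofMinutes(6);   // hypothetical pause, for illustration

        System.out.println(heartbeatInterval(session).getSeconds());   // prints 200
        System.out.println(pauseExceedsSession(gcPause, session));     // prints true
    }
}
```

With a 5 minute session timeout this gives a 200 second heartbeat interval, so a GC pause longer than 5 minutes covers the entire session, which is the case reported above where LOST arrives without a retry.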

