curator-user mailing list archives

From Jordan Zimmerman <jor...@jordanzimmerman.com>
Subject Re: How to avoid CuratorConnectionLossException on leader loss?
Date Sun, 13 Sep 2015 20:37:29 GMT
Curator only retries until the connection timeout elapses or the retry policy gives up. Try
increasing your connection timeout and allowing more than 3 retries.

-Jordan
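
To get a feel for how long a given retry policy keeps trying, here is a rough, dependency-free sketch (not part of the original reply). It assumes that retry number i (counting from 0) sleeps for baseSleepMs multiplied by a random integer between 1 and 2^(i+1) - 1, which is how Curator's ExponentialBackoffRetry is usually described; check the source of the Curator version you run before relying on it. The class and method names are hypothetical, and the model ignores any maximum-sleep cap and the time each attempt itself spends waiting on the connection timeout.

```java
/**
 * Rough model of the total sleep time of an exponential backoff retry policy.
 * ASSUMPTION: retry i (0-based) sleeps baseSleepMs * k, where k is a random
 * integer with 1 <= k <= 2^(i+1) - 1. This is an illustrative model, not
 * Curator's actual code.
 */
final class RetryBudget {

    /** Smallest possible total sleep: every retry draws the multiplier 1. */
    static long minTotalSleepMs(long baseSleepMs, int maxRetries) {
        return baseSleepMs * maxRetries;
    }

    /** Largest possible total sleep: retry i draws the multiplier 2^(i+1) - 1. */
    static long maxTotalSleepMs(long baseSleepMs, int maxRetries) {
        long total = 0;
        for (int i = 0; i < maxRetries; i++) {
            total += baseSleepMs * ((1L << (i + 1)) - 1);
        }
        return total;
    }

    public static void main(String[] args) {
        // The policy from the question: base sleep 1000 ms, 3 retries.
        System.out.println("3 retries:  " + minTotalSleepMs(1000, 3)
                + " to " + maxTotalSleepMs(1000, 3) + " ms");
        // A more patient policy with the same base sleep.
        System.out.println("10 retries: " + minTotalSleepMs(1000, 10)
                + " to " + maxTotalSleepMs(1000, 10) + " ms");
    }
}
```

Under this model, a base sleep of 1000 ms with 3 retries gives up after somewhere between 3 and 11 seconds of sleeping, which may well be shorter than a leader election. In Curator itself the corresponding knobs are the two arguments of the ExponentialBackoffRetry constructor and the connectionTimeoutMs setting on CuratorFrameworkFactory.builder().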



On September 13, 2015 at 11:16:07 AM, Jens Rantil (jens.rantil@tink.se) wrote:

Dear Curator(s),

A couple of days ago we did some maintenance on our ZooKeeper ensemble and did a rolling restart
of each node. Restarting the followers worked like a charm. However, when we restarted the leader,
Curator started throwing/logging CuratorConnectionLossException exceptions that trickled down to
our application code until a reelection had occurred. Example:

https://gist.github.com/JensRantil/309fa1bf17ee2982b8e7

We were hoping that Curator would gracefully retry until a leader had been reelected, but
I'm sure there is something we need to tweak to keep this from happening again.

Question: To keep this from happening in the future, should we simply configure our retry policy
to retry longer before giving up?

Additional information:
Zookeeper version 1.4.5
Curator version 2.7.0
We are currently using the following retry policy: new ExponentialBackoffRetry(1000, 3);
ZooKeeper configuration is all default except initLimit=60 and syncLimit=30.

Thanks,
Jens

--
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se
