curator-user mailing list archives

From Benjamin Jaton <bja...@radiantlogic.com>
Subject Curator connection states
Date Thu, 15 Jan 2015 00:28:12 GMT
Hello,

I am running some simple tests around the connection state listener
behavior.
I use a regular 3-node ensemble with one node already down, and I start/stop a
second node to trigger an outage of the ensemble.

I use:
- connection timeout : 18 seconds
- session timeout : 72 seconds
- retry interval : 5 seconds

Case 0: there is no retry:
- the switch SUSPENDED -> LOST takes less than a second
- the background retry goes on for 18 seconds

Case 1: there is 1 retry:
- the switch SUSPENDED -> LOST takes 7 seconds
- the background retry goes on for 41 seconds

Case 2: there are 2 retries:
- the switch SUSPENDED -> LOST takes 12 seconds
- the background retry goes on for 64 seconds

I expected the two durations to match in each case, i.e. I thought that we
would receive a LOST event at the moment Curator gave up retrying.

But apparently the duration of the background retries is this, where
nbRetries is the retry count from the cases above (0, 1 or 2):
*connectionTimeout * (nbRetries + 1) + retryInterval * nbRetries*
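
As a quick check of the measured background-retry durations, here is a minimal
sketch of the arithmetic. It assumes that each connection attempt runs for the
full connection timeout, that one retry interval separates consecutive
attempts, and that n retries mean n + 1 attempts in total. The class and
method names are hypothetical, and this only models the observed timings, not
Curator's internals.

```java
// Sketch only: reproduces the observed timings, not Curator's actual retry logic.
class RetryDuration {

    // Total seconds spent in background retries, under the assumptions above.
    static int backgroundRetrySeconds(int connectionTimeoutSec, int retryIntervalSec, int retries) {
        int attempts = retries + 1;             // the initial attempt plus each retry
        int waits = Math.max(0, attempts - 1);  // one sleep between consecutive attempts
        return connectionTimeoutSec * attempts + retryIntervalSec * waits;
    }

    public static void main(String[] args) {
        // Connection timeout 18 s and retry interval 5 s, as in the tests above.
        for (int retries = 0; retries <= 2; retries++) {
            System.out.println(retries + " retries -> "
                    + backgroundRetrySeconds(18, 5, retries) + " s");
        }
    }
}
```

With the settings above this gives 18, 41 and 64 seconds for 0, 1 and 2
retries, which matches the three measurements.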

Why is it linked to the connectionTimeout, since the connection fails before
that (cases 0, 1 and 2 all go into the LOST state in less than 18 seconds)?

According to http://curator.apache.org/errors.html , LOST means that "the
connection is confirmed to be lost."
So a LOST state is when I lose my ephemeral nodes (for example).
Is that correct?

If so, I am wondering why the time it takes to reach LOST would differ
depending on whether we have 0, 1 or 2 retries.

Thanks for your insights,
Benjamin
