curator-dev mailing list archives

From Randgalt <>
Subject [GitHub] curator pull request: [CURATOR-247] Extend Curator's connection st...
Date Sun, 23 Aug 2015 16:06:43 GMT
GitHub user Randgalt opened a pull request:

    [CURATOR-247] Extend Curator's connection state to support SESSION_LOST

    This is a significant change. Please review carefully and let's have a lot of discussion.
Because the new behavior is so much better (and more consistent with expectations) I've
enabled it by default.
    Major differences from the older behavior are:
    * Session/connection timeouts are no longer managed by the low-level client. They are
managed by the CuratorFramework instance. There should be no noticeable differences.
    * Prior to 3.0.0, each iteration of the retry policy would allow the connection timeout
to elapse if the connection hadn't yet succeeded. This meant that the true connection timeout
was the configured value times the maximum retries in the retry policy. This longstanding
issue has been addressed. Now, the connection timeout can elapse only once for a single API call.
    * MOST IMPORTANTLY! Prior to 3.0.0, ConnectionState.LOST did not imply a lost session
(much to the confusion of users). Now, Curator will set the LOST state only when it believes
that the ZooKeeper session has expired. ZooKeeper connections have a session. When the session
expires, clients must take appropriate action. In Curator, this is complicated by the fact
that Curator internally manages the ZooKeeper connection. Now, Curator will set the LOST state
when any of the following occurs: a) ZooKeeper returns a Watcher.Event.KeeperState.Expired
or KeeperException.Code.SESSIONEXPIRED; b) Curator closes the internally managed ZooKeeper
instance; c) The configured session timeout elapses during a network partition.
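    The three LOST conditions listed above can be sketched as a small decision rule. This is only an illustration of the rule as described in this message, not Curator's implementation: the class, enum, and method names are hypothetical, and treating "elapses" as "disconnected for at least the session timeout" is an assumption.

```java
import java.util.EnumSet;
import java.util.Set;

public class LostStateSketch {
    // Hypothetical names for the three triggers (a), (b), (c); not part of Curator's API.
    enum Trigger {
        SESSION_EXPIRED_REPORTED,  // (a) ZooKeeper reports Expired / SESSIONEXPIRED
        INTERNAL_HANDLE_CLOSED,    // (b) Curator closes its internally managed ZooKeeper instance
        SESSION_TIMEOUT_ELAPSED    // (c) session timeout elapses during a network partition
    }

    // Collects every trigger that currently applies.
    static Set<Trigger> activeTriggers(boolean expiredReported, boolean handleClosed,
                                       long disconnectedForMs, long sessionTimeoutMs) {
        Set<Trigger> active = EnumSet.noneOf(Trigger.class);
        if (expiredReported) {
            active.add(Trigger.SESSION_EXPIRED_REPORTED);
        }
        if (handleClosed) {
            active.add(Trigger.INTERNAL_HANDLE_CLOSED);
        }
        // Assumption: ">=" models "the configured session timeout elapses".
        if (disconnectedForMs >= sessionTimeoutMs) {
            active.add(Trigger.SESSION_TIMEOUT_ELAPSED);
        }
        return active;
    }

    // LOST is set when any one of the triggers applies.
    static boolean shouldSetLost(boolean expiredReported, boolean handleClosed,
                                 long disconnectedForMs, long sessionTimeoutMs) {
        return !activeTriggers(expiredReported, handleClosed,
                               disconnectedForMs, sessionTimeoutMs).isEmpty();
    }

    public static void main(String[] args) {
        long sessionTimeoutMs = 60_000L;
        // A short disconnect with no expiry report: the session may still be alive.
        System.out.println(shouldSetLost(false, false, 5_000L, sessionTimeoutMs));
        // Disconnected for the whole session timeout: treat the session as lost.
        System.out.println(shouldSetLost(false, false, 60_000L, sessionTimeoutMs));
    }
}
```

    Under this rule a brief disconnect alone does not produce LOST, which is the distinction from the pre-3.0.0 behavior described above.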
    Something important to consider: given the significance of this change, it makes sense to have
it be part of 3.0.0, but if we merge it into 3.0.0 now it will be harder to maintain master
and 3.0.0 as separate branches. Some git expertise is needed here on how to manage this.
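
    For reviewers, the connection-timeout change described above comes down to simple arithmetic. The sketch below only restates the description (configured timeout times maximum retries before, a single timeout after). The class and method names are hypothetical, the 15000 ms and 3 retries are made-up inputs, and any sleep time between retries is ignored.

```java
public class ConnectionTimeoutSketch {
    // Pre-3.0.0 as described: each retry-policy iteration may wait out the full timeout.
    static long worstCaseWaitBeforeMs(long connectionTimeoutMs, int maxRetries) {
        return connectionTimeoutMs * maxRetries;
    }

    // New behavior as described: the timeout can elapse only once per API call.
    static long worstCaseWaitAfterMs(long connectionTimeoutMs) {
        return connectionTimeoutMs;
    }

    public static void main(String[] args) {
        long connectionTimeoutMs = 15_000L; // illustrative value only
        int maxRetries = 3;                 // illustrative value only
        System.out.println("before: " + worstCaseWaitBeforeMs(connectionTimeoutMs, maxRetries) + " ms");
        System.out.println("after:  " + worstCaseWaitAfterMs(connectionTimeoutMs) + " ms");
    }
}
```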

You can merge this pull request into a Git repository by running:

    $ git pull CURATOR-247

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #97
commit 344634ac6e34e61bc0cc7b41923a1df4089c7948
Author: randgalt <>
Date:   2015-08-21T17:10:24Z

    First pass at new (optional) definition of state LOST

commit 2343daf29388566b0efa0b0a2ad21574fb534a27
Author: randgalt <>
Date:   2015-08-21T20:11:59Z

    Merge branch 'CURATOR-3.0' into CURATOR-247

commit 62f3c33cdb556eccf6fe1cc87ee74b3458431777
Author: randgalt <>
Date:   2015-08-21T22:35:44Z

    Continued work on new LOST behavior. Added some tests. To get correct behavior it's necessary
to not retry connection failures. Retrying connection failures was never a good idea and here's
a good opportunity to fix it as this requires client action to enable

commit c5a49216cc78b052066661a8ded357e50e0b6313
Author: randgalt <>
Date:   2015-08-21T22:37:15Z


commit d3170099757c7e17ff8fbee0c37d620aacb60d65
Author: randgalt <>
Date:   2015-08-21T22:49:55Z

    more tests

commit b8d4c3d77de029917820634fa4ed21be19bbcf2c
Author: randgalt <>
Date:   2015-08-21T22:59:07Z

    minor reformat

commit 847cc0d2415f59c2943d4a2734564119ffb38bb1
Author: randgalt <>
Date:   2015-08-22T15:47:01Z


commit ec2f9bd555d01b324bd5ef690b1036d98e1f3702
Author: randgalt <>
Date:   2015-08-22T16:06:33Z

    Fixed testRetry() for new LOST behavior

commit 6381ccb6536f4710248a50ae5d0313399bbfe858
Author: randgalt <>
Date:   2015-08-22T22:50:09Z

    removed some test code

commit e239137019608f02cabb23c27ab13adcef88c027
Author: randgalt <>
Date:   2015-08-23T00:06:55Z

    major refactoring. Abstracting old/new behavior into a pluggable ConnectionHandlingPolicy.
Also, IMPORTANT, made the new behavior the default. This needs to be discussed but it's a
major improvement and we should default to it.

commit 30bd7b655d201762d8ff74062964621879ac7134
Author: randgalt <>
Date:   2015-08-23T00:29:36Z

    further refactoring. Abstracted old framework-level connection handling into ClassicInternalConnectionHandler.
Probably more to do here

commit 23554479597d654fa8318cdc579fc3cc29bc2c54
Author: randgalt <>
Date:   2015-08-23T01:10:34Z

    Curator has a big problem with thread interrupted states getting cleared. There are several
issues on this (CURATOR-208, CURATOR-205, CURATOR-228, CURATOR-109)

commit 05d241da642c6ba0d16b3ce97557128fad4dfe41
Author: randgalt <>
Date:   2015-08-23T01:32:41Z

    When the connection timeout elapses and there is more than one server in the connection
string, reset the connection and try again

commit face4034e9fdcc9ffdb394c7c1682add834a1e10
Author: randgalt <>
Date:   2015-08-23T02:54:24Z

    Longer connection timeout needed

commit 5f094f8bb6dca3c056051cb8800b418839cca0e1
Author: randgalt <>
Date:   2015-08-23T12:49:17Z

    More refinement of classic/new connection handling. Reworked how the retry policy is invoked
for each. New behavior is now confirmed to be: wait for connection timeout only once. Some
tests will need work due to this

commit e001e0098f64baa8e0b3b887507bc98972c775dc
Author: randgalt <>
Date:   2015-08-23T14:33:46Z

    more work on repairing tests for new connection handling

commit 1a2a94b625e7e1b5e535414e397e9b3a4173ca1b
Author: randgalt <>
Date:   2015-08-23T15:54:29Z

    more work on repairing tests for new connection handling

commit 64d966c18b9d18c40e13fda98e52d9253b281086
Author: randgalt <>
Date:   2015-08-23T15:57:48Z


commit 9c7cf5d8ba495bccdea2bcb6b377e95f5f99d521
Author: randgalt <>
Date:   2015-08-23T16:02:19Z



