curator-dev mailing list archives

From "Jordan Zimmerman (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CURATOR-64) Retry logic appears to delay reconnect after session expiry
Date Thu, 10 Oct 2013 05:17:42 GMT

     [ https://issues.apache.org/jira/browse/CURATOR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jordan Zimmerman resolved CURATOR-64.
-------------------------------------

    Resolution: Not A Problem

> the first getData() fails because of session expiry (should be nearly instantly) 
This is an incorrect assumption. Curator manages the ZooKeeper connection for you. When there
is a session expiration, Curator will try to re-establish a connection.

Also, you should be aware that ZooKeeper watches are single-threaded. There is a Curator Tech
Note on this (though the issue is with ZooKeeper and not Curator): https://cwiki.apache.org/confluence/display/CURATOR/TN1
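
To illustrate what single-threaded delivery means in practice, here is a minimal sketch that
uses only the JDK. It does not touch ZooKeeper or Curator: a single-thread executor stands in
for the event thread, and the names WatcherHandoff, completionOrder and blockFor are
hypothetical. It compares a handler that blocks on the event thread with one that hands its
blocking work to a worker pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class WatcherHandoff {

    /** Stand-in for a watcher body that blocks, e.g. on a retried read. */
    private static void blockFor(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Delivers two events through one simulated event thread and returns
     * the order in which their handlers finished.
     */
    public static List<String> completionOrder(boolean handOff) throws InterruptedException {
        // Assumption: one thread delivers all events in order.
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        ExecutorService workers = Executors.newCachedThreadPool();
        List<String> finished = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(2);

        Runnable slowHandler = () -> {
            blockFor(300);
            finished.add("slow");
            done.countDown();
        };
        Runnable fastHandler = () -> {
            finished.add("fast");
            done.countDown();
        };

        // Either block on the event thread itself, or only enqueue the
        // blocking work there and let a worker thread carry it out.
        Runnable firstEvent = handOff ? () -> workers.execute(slowHandler) : slowHandler;

        eventThread.execute(firstEvent);
        eventThread.execute(fastHandler);

        done.await(5, TimeUnit.SECONDS);
        eventThread.shutdown();
        workers.shutdown();
        return new ArrayList<>(finished);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("inline:   " + completionOrder(false));
        System.out.println("hand-off: " + completionOrder(true));
    }
}
```

In the inline case the second event cannot be handled until the blocking handler returns, so
"slow" finishes first. With the hand-off, the event thread is free again right away and "fast"
finishes first.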

If you want to sidestep Curator's ZooKeeper handle management, you can get the raw ZooKeeper
handle and use that, but I don't recommend it.

If this doesn't answer your question and you still feel Curator should do something different,
let me know and I'll re-open the issue.

> Retry logic appears to delay reconnect after session expiry
> -----------------------------------------------------------
>
>                 Key: CURATOR-64
>                 URL: https://issues.apache.org/jira/browse/CURATOR-64
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>            Reporter: Shaun Senecal
>         Attachments: SessionExpiryTest.java
>
>
> If a watch is triggered immediately before a session expiry, and the watch attempts to
> fetch data from ZK (using Curator), it's possible that the reconnect behaviour is delayed
> until the retry gives up.
> It currently looks something like this:
> 1. watch A is triggered, begins processing
> 2. session is expired (watch A hasn't completed execution yet)
> 3. watch A attempts to fetch data from ZK (say: curator.getData()...)
> 4. the getData() will retry until the policy tells it to give up (could be several minutes)
> 5. finally curator will reconnect to ZK
> I would expect something more like this:
> 1. watch A is triggered, begins processing
> 2. session is expired (watch A hasn't completed execution yet)
> 3. watch A attempts to fetch data from ZK (say: curator.getData()...)
> 4. the first getData() fails because of session expiry (should be nearly instantly)
> 5. curator reconnects to ZK
> 6. a second attempt to call getData() is made via the RetryPolicy
> 7. watch A completes processing
> We are using the BoundedExponentialBackoffRetry, so we end up waiting for quite a while
> after session expiry, leaving our services dead in the water for much longer than is necessary.
> This occurs with Curator v1.3.3 and ZK 3.4.5
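
For a sense of scale on the "several minutes" mentioned in the report, the sketch below adds
up worst-case sleep times for a bounded exponential backoff. The formula is an assumption
(each sleep is at most base * (2^(retryCount + 1) - 1), capped at a maximum), and the class
name BackoffEstimate and the settings in main are hypothetical; the report does not state
the reporter's actual configuration. Time spent inside each failed attempt is not counted.

```java
final class BackoffEstimate {

    /**
     * Assumed upper bound for the sleep before retry number retryCount
     * (0-based): base * (2^(retryCount + 1) - 1), capped at maxSleepMs.
     */
    public static long worstCaseSleepMs(long baseSleepMs, long maxSleepMs, int retryCount) {
        // Clamp the shift so the multiplication cannot overflow a long
        // for realistic base sleep times.
        int shift = Math.min(retryCount + 1, 30);
        long uncapped = baseSleepMs * ((1L << shift) - 1);
        return Math.min(maxSleepMs, uncapped);
    }

    /** Sum of the worst-case sleeps over maxRetries retries. */
    public static long worstCaseTotalSleepMs(long baseSleepMs, long maxSleepMs, int maxRetries) {
        long total = 0;
        for (int retry = 0; retry < maxRetries; retry++) {
            total += worstCaseSleepMs(baseSleepMs, maxSleepMs, retry);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical settings: 1 s base sleep, 30 s cap, 10 retries.
        long totalMs = worstCaseTotalSleepMs(1_000, 30_000, 10);
        System.out.println("worst-case total sleep: " + totalMs + " ms");
    }
}
```

With these hypothetical settings the worst-case sleeps are 1 s, 3 s, 7 s and 15 s, followed by
six sleeps at the 30 s cap, for a total of 206 s (roughly three and a half minutes) before the
policy gives up.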



--
This message was sent by Atlassian JIRA
(v6.1#6144)
