curator-dev mailing list archives

From "Shaun Senecal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-64) Retry logic appears to delay reconnect after session expiry
Date Thu, 10 Oct 2013 05:41:42 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791219#comment-13791219 ]

Shaun Senecal commented on CURATOR-64:
--------------------------------------

After moving all of my notification processing to a background thread, the problem went away.
Thanks for pointing that out. I guess the issue is that we get disconnected, but because
the watch callbacks are still running, they prevent the reconnect from executing properly.
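
For readers hitting the same problem, here is a minimal sketch of the workaround described above, using only java.util.concurrent. The class and method names (WatchOffload, onEvent, shutdownAndWait) are hypothetical and are not part of Curator; onEvent stands in for whatever watcher or listener callback receives the notification. The idea is that the callback only enqueues the work and returns, so the thread that delivered the event is not held up by slow processing.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class WatchOffload {
    // Hypothetical worker pool that does the actual notification processing.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Hypothetical callback, playing the role of a watcher's process() method.
    // It only enqueues the work, so the calling (event) thread returns at once.
    void onEvent(Runnable slowProcessing) {
        worker.submit(slowProcessing);
    }

    // Lets queued work finish, then reports whether the pool stopped in time.
    boolean shutdownAndWait(long timeoutMs) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        WatchOffload offload = new WatchOffload();
        long start = System.nanoTime();
        offload.onEvent(() -> {
            try {
                Thread.sleep(500); // simulated slow processing (e.g. a blocking read)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("callback returned after ~" + elapsedMs + " ms");
        offload.shutdownAndWait(5000);
    }
}
```

A single-threaded executor keeps notifications in arrival order; a larger pool would process them concurrently at the cost of that ordering.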

> Retry logic appears to delay reconnect after session expiry
> -----------------------------------------------------------
>
>                 Key: CURATOR-64
>                 URL: https://issues.apache.org/jira/browse/CURATOR-64
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>            Reporter: Shaun Senecal
>         Attachments: SessionExpiryTest.java
>
>
> If a watch is triggered immediately before a session expiry, and the watch attempts to
> fetch data from ZK (using Curator), it's possible that the reconnect behaviour is delayed
> until the retry gives up.
> It currently looks something like this:
> 1. watch A is triggered, begins processing
> 2. session is expired (watch A hasn't completed execution yet)
> 3. watch A attempts to fetch data from ZK (say: curator.getData()...)
> 4. the getData() will retry until the policy tells it to give up (could be several minutes)
> 5. finally curator will reconnect to ZK
> I would expect something more like this:
> 1. watch A is triggered, begins processing
> 2. session is expired (watch A hasn't completed execution yet)
> 3. watch A attempts to fetch data from ZK (say: curator.getData()...)
> 4. the first getData() fails because of session expiry (should be nearly instant)
> 5. curator reconnects to ZK
> 6. a second attempt to call getData() is made via the RetryPolicy
> 7. watch A completes processing
> We are using the BoundedExponentialBackoffRetry, so we end up waiting for quite a while
> after session expiry, leaving our services dead in the water for much longer than is necessary.
> This occurs with curator v1.3.3 and ZK 3.4.5
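
To give a feel for why a blocked getData() "could be several minutes" under a bounded exponential backoff, here is a rough sketch that adds up the sleeps of a simplified schedule. It assumes a plain capped doubling (base, 2x base, 4x base, ... up to a maximum); this is an illustration only and not Curator's exact formula, which may include randomization. The class name and the parameter values (1 s base, 30 s cap, 10 retries) are made up for the example and do not come from the report.

```java
class BackoffBudget {
    // Sleep before retry number `retry` (0-based): the base doubles each time
    // and is capped at maxMs. Simplified assumption, not Curator's formula.
    static long sleepMs(long baseMs, long maxMs, int retry) {
        long doubled = baseMs << Math.min(retry, 30); // clamp the shift to avoid overflow
        return Math.min(maxMs, doubled);
    }

    // Total time spent sleeping if every retry is used up.
    static long worstCaseTotalMs(long baseMs, long maxMs, int maxRetries) {
        long total = 0;
        for (int i = 0; i < maxRetries; i++) {
            total += sleepMs(baseMs, maxMs, i);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical settings: 1 s base sleep, 30 s cap, 10 retries.
        System.out.println(worstCaseTotalMs(1000, 30000, 10) + " ms");
    }
}
```

With these made-up settings the sleeps are 1, 2, 4, 8 and 16 seconds followed by five sleeps of 30 seconds, so the program prints 181000 ms, roughly three minutes during which a caller waiting on the retry loop makes no progress.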



--
This message was sent by Atlassian JIRA
(v6.1#6144)
