curator-dev mailing list archives

From "Shaun Senecal (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (CURATOR-64) Retry logic appears to delay reconnect after session expiry
Date Thu, 10 Oct 2013 05:41:41 GMT

     [ https://issues.apache.org/jira/browse/CURATOR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaun Senecal updated CURATOR-64:
---------------------------------

    Comment: was deleted

(was: I'm still confused.

The behaviour we are seeing is that Curator is hanging for several minutes, logging exceptions
about failed retry attempts all along the way, before being able to reconnect.  Are you saying
this is the expected behaviour?

I understand that Curator is managing the connection for me, which is why I assume that the
retry logic should be able to run in parallel with the reconnect logic so that our service
spends as little time as possible disconnected from the cluster.  Am I still missing something?

)

> Retry logic appears to delay reconnect after session expiry
> -----------------------------------------------------------
>
>                 Key: CURATOR-64
>                 URL: https://issues.apache.org/jira/browse/CURATOR-64
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>            Reporter: Shaun Senecal
>         Attachments: SessionExpiryTest.java
>
>
> If a watch is triggered immediately before a session expiry, and the watch attempts to
> fetch data from ZK (using Curator), it's possible that the reconnect behaviour is delayed until
> the retry gives up.
> It currently looks something like this:
> 1. watch A is triggered, begins processing
> 2. session is expired (watch A hasn't completed execution yet)
> 3. watch A attempts to fetch data from ZK (say: curator.getData()...)
> 4. the getData() will retry until the policy tells it to give up (could be several minutes)
> 5. finally curator will reconnect to ZK
> I would expect something more like this:
> 1. watch A is triggered, begins processing
> 2. session is expired (watch A hasn't completed execution yet)
> 3. watch A attempts to fetch data from ZK (say: curator.getData()...)
> 4. the first getData() fails because of session expiry (should be nearly instant)
> 5. curator reconnects to ZK
> 6. a second attempt to call getData() is made via the RetryPolicy
> 7. watch A completes processing
> We are using the BoundedExponentialBackoffRetry, so we end up waiting for quite a while
> after session expiry, leaving our services dead in the water for much longer than is necessary.
> This occurs with curator v1.3.3 and ZK 3.4.5
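
To make the difference between the two sequences above concrete, here is a minimal, self-contained simulation. It does not use Curator or ZooKeeper at all: `FakeClient`, `reportedFlow`, `expectedFlow` and `backoffMs` are hypothetical names invented for this sketch, the backoff is a simplified deterministic upper bound (doubling per retry, capped) rather than Curator's actual BoundedExponentialBackoffRetry calculation, and the 1 s base / 30 s cap / 10 retries are assumed values, not taken from the report. No real sleeping happens; the waits are only summed.

```java
import java.util.ArrayList;
import java.util.List;

public class SessionExpirySketch {
    /** Hypothetical stand-in for a ZK client; this is NOT Curator's API. */
    static class FakeClient {
        boolean sessionExpired = true;               // session expired before the watch finished
        final List<String> events = new ArrayList<>();

        void reconnect() {
            sessionExpired = false;
            events.add("reconnect");
        }

        String getData() {
            if (sessionExpired) {
                events.add("getData failed");
                throw new IllegalStateException("session expired");
            }
            events.add("getData ok");
            return "payload";
        }
    }

    /** Simplified backoff (assumption): doubles on each retry, capped at maxMs. */
    static long backoffMs(long baseMs, long maxMs, int retryCount) {
        return Math.min(maxMs, baseMs << (retryCount + 1));
    }

    /** Reported sequence: retries are exhausted first, and only then does the reconnect happen. */
    static long reportedFlow(FakeClient client, long baseMs, long maxMs, int maxRetries) {
        long waitedMs = 0;
        for (int retry = 0; retry < maxRetries; retry++) {
            try {
                client.getData();
                return waitedMs;
            } catch (IllegalStateException e) {
                waitedMs += backoffMs(baseMs, maxMs, retry);   // simulated wait, no real sleep
            }
        }
        client.reconnect();                                     // step 5: finally reconnect
        return waitedMs;
    }

    /** Expected sequence: the first call fails fast, the client reconnects, then the retry runs. */
    static long expectedFlow(FakeClient client, long baseMs, long maxMs, int maxRetries) {
        long waitedMs = 0;
        for (int retry = 0; retry <= maxRetries; retry++) {
            try {
                client.getData();
                return waitedMs;
            } catch (IllegalStateException e) {
                client.reconnect();                             // reconnect before retrying
                waitedMs += backoffMs(baseMs, maxMs, retry);
            }
        }
        return waitedMs;
    }

    public static void main(String[] args) {
        // Assumed policy values: 1 s base sleep, 30 s cap, 10 retries.
        System.out.println(reportedFlow(new FakeClient(), 1000L, 30000L, 10)); // prints 210000
        System.out.println(expectedFlow(new FakeClient(), 1000L, 30000L, 10)); // prints 2000
    }
}
```

Under these assumed values the reported sequence accumulates 3.5 minutes of waiting before the reconnect, while the expected sequence waits for a single 2 second backoff. The real numbers depend on the configured policy and on how Curator computes its sleep times.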



--
This message was sent by Atlassian JIRA
(v6.1#6144)
