curator-dev mailing list archives

From "Alex Rankin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-439) CuratorFrameworkState STARTED, but ZookeeperClient not connected
Date Mon, 23 Apr 2018 09:32:00 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447840#comment-16447840
] 

Alex Rankin commented on CURATOR-439:
-------------------------------------

Thanks [~randgalt] - I think the confusion just comes from the lack of good examples or explanation
of the behaviour of Curator in different scenarios. We did have a {{ConnectionStateListener}},
but the following line in the documentation made us think there was more we should be doing:
{quote}Clients can monitor these changes and take appropriate action.
{quote}
Looking at other libraries ([like this|https://github.com/mitdbg/amoeba/blob/master/src/main/java/core/utils/CuratorUtils.java#L66]),
people seemed to be checking that the ZK Client was connected - so we thought that was a
good practice. 

If I understand correctly, the following should be true:
 # {{ConnectionStateListener}} does not need to do anything - it can be used purely to log
changes in the state of Curator, but no further action is needed. {{LOST}} or {{SUSPENDED}}
connections should automatically move to {{RECONNECTED}} when the network is back up.
 # I should not check {{getZookeeperClient().isConnected()}} before any action - just perform
the action, and if the client isn't connected, it will connect (if possible).
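
To make sure I'm describing the same thing, here is a rough sketch of those two points in plain Java. It is not Curator code: the {{State}} enum, {{LoggingListener}} and {{callWithRetry}} are hypothetical stand-ins for Curator's {{ConnectionState}}, a {{ConnectionStateListener}}, and the retry policy configured on the client, and the "operation" is simulated. The only thing it is meant to show is the shape of the pattern: the listener just records transitions, and the caller attempts the operation directly instead of gating it on an {{isConnected()}} check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

class NoPreCheckSketch {

    // Hypothetical stand-in for Curator's ConnectionState values.
    enum State { CONNECTED, SUSPENDED, LOST, RECONNECTED }

    // Point 1: the listener only logs/records transitions; it takes no recovery action.
    static final class LoggingListener {
        final List<State> seen = new ArrayList<>();

        void stateChanged(State newState) {
            seen.add(newState);
            System.out.println("connection state: " + newState);
        }
    }

    // Point 2: no "is it connected?" check up front. Attempt the action and
    // retry on failure (a simplified stand-in for a retry policy).
    static <T> T callWithRetry(Callable<T> action, int maxAttempts, long sleepMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMs);
                }
            }
        }
        throw last; // all attempts failed; surface the last error to the caller
    }

    public static void main(String[] args) throws Exception {
        LoggingListener listener = new LoggingListener();
        listener.stateChanged(State.SUSPENDED);   // network drops
        listener.stateChanged(State.RECONNECTED); // network returns

        // Simulated operation: fails twice (as if disconnected), then succeeds.
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated connection loss");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

If the real behaviour differs from this (for example, if the listener is expected to trigger some recovery itself), that is exactly the part I'd like to have confirmed.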

If I've got this right, then I'll make sure to close this ticket as "Not an Issue".

> CuratorFrameworkState STARTED, but ZookeeperClient not connected
> ----------------------------------------------------------------
>
>                 Key: CURATOR-439
>                 URL: https://issues.apache.org/jira/browse/CURATOR-439
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: 3.2.1
>            Reporter: Alex Rankin
>            Priority: Major
>
> I recently ran into an issue on some of our nodes caused by network issues between a
service and Zookeeper. I have been unable to recreate them as of yet, but I'm still trying.
> *+Setup+*
> 5x services using Curator 3.2.1 to talk to a Zookeeper 3.5.3 cluster (also 5 nodes).
> Network issues caused the services to disconnect from Zookeeper. 
> There's a check in our code to see if the Zookeeper connection is available before sending
a request:
> {quote}public boolean isConnected() {
>     return curatorFramework.getZookeeperClient().isConnected();
> }
> {quote}
> After the network issues resolved, we noticed that all calls to Zookeeper from 4 of the
services were still failing (the fifth was fine). Checking the logs, we saw that {{CuratorFramework.getState()}}
was reporting the state as STARTED, but {{curatorFramework.getZookeeperClient().isConnected();}}
was returning false. Restarting the service fixed everything, but I obviously want to avoid
this issue in future.
> *+Problem+*
> I couldn't find any documentation stating whether {{CuratorZookeeperClient.isConnected()}}
should be used, or if {{CuratorFramework.getState() == CuratorFrameworkState.STARTED}} (the
functionality of the deprecated {{CuratorFramework.isConnected()}}) would be the better check,
or if these should both be equivalent, and there's a bug that let one be true while the other
was false.
> If my own check is wrong, and I shouldn't be using {{CuratorZookeeperClient.isConnected()}},
then I can easily fix that. I wanted to check the expected behaviour before diving too deep
into this, in case this is normal and I am just using Curator incorrectly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
