curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-157) Avoid stack traces closing PathChildrenCache followed by closing CuratorFramework
Date Thu, 30 Oct 2014 17:37:33 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190442#comment-14190442
] 

ASF GitHub Bot commented on CURATOR-157:
----------------------------------------

Github user bdumon commented on a diff in the pull request:

    https://github.com/apache/curator/pull/51#discussion_r19621734
  
    --- Diff: curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/PathChildrenCache.java
---
    @@ -382,6 +383,11 @@ public void close() throws IOException
                 connectionStateListener = null;
                 childrenWatcher = null;
                 dataWatcher = null;
    +
    +            synchronized (backgroundTaskMonitor) {
    --- End diff --
    
    No problem, there might a problem in my reasoning, which is as follows: a refresh might
be going on when PathChildrenCache.close() is called. The refresh runs on a background thread,
the ZK event thread if I'm not mistaken. Since the processResult will be running within a
synchronized block, the PathChildrenCache.close() method will not be able to return until
the processResult is finished, since it tries to acquire the same lock.
    
    The second synchronized block in refresh(), around the statements "if-closed-return, getChildren()..."
is for similar reasons: the if-closed in that block might be called before the closed state
is set, while the getChildren might be launched afterwards, but by then zookeeper might already
be closed as well. By putting it in a synchronized block, we are sure no ZooKeeper operation
is going on after returning from the close() method, and CuratorFramework/ZooKeeper can be
closed without causing errors to be logged.


> Avoid stack traces closing PathChildrenCache followed by closing CuratorFramework
> ---------------------------------------------------------------------------------
>
>                 Key: CURATOR-157
>                 URL: https://issues.apache.org/jira/browse/CURATOR-157
>             Project: Apache Curator
>          Issue Type: Improvement
>            Reporter: Bruno Dumon
>            Assignee: Jordan Zimmerman
>         Attachments: LogProblemIllustration.java
>
>
> When closing PathChildrenCache, and immediately afterwards closing CuratorFramework,
some ERROR-level stack traces are logged.
> This was previously reported on the mailing list: http://curator.markmail.org/thread/bmfr62ekx5p2vv7f
> The cause is that the BackgroundCallback defined in PathChildrenCache.refresh() will,
when triggered, perform some more ZooKeeper operations.
> Thus one can get in sequences such as:
>  * operation with BackgroundCallback is submitted
>  * processResult of the BackgroundCallback is called
>  * PathChildrenCache is closed
>  * CuratorFramework is closed
>  * processResult, which is running on another thread, comes to the point it does operations
on ZooKeeper, which fail because ZooKeeper is closed.
> There is no real impact on the application, it is just for log-esthetical reasons that
I'd like to avoid it.
> In the more common case, the processResult will receive an IllegalStateException, which
could be easily catched and ignored in PathChildrenCache if the PathChildrenCache is closed:
> {noformat}
> 14/10/30 11:24:51 ERROR org.apache.curator.framework.imps.CuratorFrameworkImpl: Background
exception was not retry-able or retry gave up
> java.lang.IllegalStateException: instance must be started before calling this method
> 	at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.getData(CuratorFrameworkImpl.java:360)
> 	at org.apache.curator.framework.recipes.cache.PathChildrenCache.getDataAndStat(PathChildrenCache.java:545)
> 	at org.apache.curator.framework.recipes.cache.PathChildrenCache.processChildren(PathChildrenCache.java:668)
> 	at org.apache.curator.framework.recipes.cache.PathChildrenCache.access$200(PathChildrenCache.java:68)
> 	at org.apache.curator.framework.recipes.cache.PathChildrenCache$4.processResult(PathChildrenCache.java:490)
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:715)
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:502)
> 	at org.apache.curator.framework.imps.GetChildrenBuilderImpl$2.processResult(GetChildrenBuilderImpl.java:166)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:590)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> But sometimes it also fails with other async operations deeper down:
> {noformat}
> 14/10/30 11:24:51 ERROR org.apache.curator.framework.imps.CuratorFrameworkImpl: Background
exception was not retry-able or retry gave up
> java.lang.IllegalStateException: Client is not started
> 	at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
> 	at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:113)
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:474)
> 	at org.apache.curator.framework.imps.GetDataBuilderImpl.performBackgroundOperation(GetDataBuilderImpl.java:263)
> 	at org.apache.curator.framework.imps.OperationAndData.callPerformBackgroundOperation(OperationAndData.java:65)
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:789)
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:487)
> 	at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:275)
> 	at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
> 	at org.apache.curator.framework.recipes.cache.PathChildrenCache.getDataAndStat(PathChildrenCache.java:545)
> 	at org.apache.curator.framework.recipes.cache.PathChildrenCache.processChildren(PathChildrenCache.java:668)
> 	at org.apache.curator.framework.recipes.cache.PathChildrenCache.access$200(PathChildrenCache.java:68)
> 	at org.apache.curator.framework.recipes.cache.PathChildrenCache$4.processResult(PathChildrenCache.java:490)
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:715)
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:502)
> 	at org.apache.curator.framework.imps.GetChildrenBuilderImpl$2.processResult(GetChildrenBuilderImpl.java:166)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:590)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> Therefore I have created a patch where PathChildrenCache.close() will wait until the
possibly running BackgroundCallback is finished.
> I will also attach a small class that illustrates the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message