curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stig Rohde Døssing (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CURATOR-106) Issuing a guaranteed delete can cause stack overflow if ZK is not reachable
Date Fri, 21 Apr 2017 17:42:04 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979112#comment-15979112
] 

Stig Rohde Døssing commented on CURATOR-106:
--------------------------------------------

This is still an issue in 2.12.0, and looking at the code also seems to be a problem in 3.x.
It occurs consistently for me in a test where I've disabled retries and are trying to release
an InterProcessMutex while there is no connection to Zookeeper (probable a contrived example,
but I suspect the same thing would happen with any guaranteed delete). It looks to me like
the problem is that CuratorFrameworkImpl.processBackgroundOperation actually does the operation
in the foreground on the first attempt https://github.com/apache/curator/blob/master/curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java#L505.
Maybe this can be fixed by making processBackgroundOperation enqueue the operation instead
of trying to perform it, and then let the background operations loop pick up the operation
instead?

{code}
at org.apache.curator.framework.imps.DeleteBuilderImpl$4.retriesExhausted(DeleteBuilderImpl.java:217)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:716)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:507)
at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:221)
at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:35)
at org.apache.curator.framework.imps.FailedDeleteManager.addFailedDelete(FailedDeleteManager.java:55)
at org.apache.curator.framework.imps.DeleteBuilderImpl$4.retriesExhausted(DeleteBuilderImpl.java:217)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:716)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:507)
at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:221)
at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:35)
at org.apache.curator.framework.imps.FailedDeleteManager.addFailedDelete(FailedDeleteManager.java:55)
etc.
{code}

> Issuing a guaranteed delete can cause stack overflow if ZK is not reachable
> ---------------------------------------------------------------------------
>
>                 Key: CURATOR-106
>                 URL: https://issues.apache.org/jira/browse/CURATOR-106
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: 2.4.2
>            Reporter: Jasdeep Hundal
>             Fix For: awaiting-response
>
>
> For guaranteed deletes (eg. lock releases) that fail, the FailedDeleteManager issues
another guaranteed delete here:
> https://github.com/apache/curator/blob/master/curator-framework/src/main/java/org/apache/curator/framework/imps/FailedDeleteManager.java#L35
> In an environment where ZK has the potential to be down for an extended period of time,
this has the potential to recurse until there is a stack overflow (particularly if the application
is using multiple locks.)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message