curator-dev mailing list archives

From "J D (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-233) Bug in double barrier
Date Wed, 26 Aug 2015 15:05:45 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713621#comment-14713621 ]

J D commented on CURATOR-233:
-----------------------------

Hi Mike,

I can confirm that the code works for 2 nodes.

However, I think the two lines marked below should be moved into the else branch.

{code:title=DistributedDoubleBarrier.java|borderStyle=solid}
            String watchPath; // Watch somebody else that still exists
            if ( ourIndex == 0 )
            {
                watchPath = ZKPaths.makePath(barrierPath, children.get(children.size() - 1));
            }
            else
            {
                watchPath = ZKPaths.makePath(barrierPath, children.get(0));
                checkDeleteOurPath(ourNodeShouldExist); // proposed location
                ourNodeShouldExist = false; // proposed location
            }

            Stat stat = client.checkExists().usingWatcher(watcher).forPath(watchPath);

            checkDeleteOurPath(ourNodeShouldExist); // current location, proposed to be removed
            ourNodeShouldExist = false; // current location, proposed to be removed
{code}


As you guessed correctly, the fix changes the behavior for 3+ nodes. The reason is that the
exit barrier uses a shortcut that is not compatible with client 0 leaving prematurely
(http://zookeeper.apache.org/doc/r3.1.2/recipes.html#sc_doubleBarriers).

Client 0 watches one of the other nodes (and leaves the barrier once it is the only one left).
All other clients watch client 0 (and leave the barrier once client 0 has left).
Thus, if any client other than client 0 leaves after maxWaitMs, nothing happens and all
remaining clients keep waiting.
But if client 0 leaves after maxWaitMs, all other nodes leave together with client 0 (even
if they do not have a maxWaitMs time limit).
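
To make the asymmetry concrete, here is a minimal toy model of the watch rules described above. It is only a sketch and not Curator code: the class ExitBarrierSketch, its methods, and the client names used with it are hypothetical, and it assumes the children list is sorted so that index 0 is "client 0".

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the exit-barrier shortcut; hypothetical, not part of Curator. */
final class ExitBarrierSketch {

    /** Returns the node that the client at position ourIndex would watch. */
    static String watchTarget(List<String> sortedChildren, int ourIndex) {
        if (ourIndex == 0) {
            // Client 0 watches the last node that still exists.
            return sortedChildren.get(sortedChildren.size() - 1);
        }
        // Every other client watches client 0.
        return sortedChildren.get(0);
    }

    /**
     * Returns the clients that leave the barrier as a consequence of
     * `departed` removing its node (for example after its maxWaitMs elapsed).
     */
    static List<String> releasedBy(List<String> sortedChildren, String departed) {
        String clientZero = sortedChildren.get(0);
        List<String> remaining = new ArrayList<>(sortedChildren);
        remaining.remove(departed);

        if (departed.equals(clientZero)) {
            // Everyone else watches client 0, so they all leave with it.
            return remaining;
        }
        if (remaining.size() == 1) {
            // Only client 0 is left, so it leaves as well.
            return remaining;
        }
        // Some other client left: the rest keep waiting.
        return new ArrayList<>();
    }
}
```

With three clients c0, c1 and c2, releasedBy returns c1 and c2 when c0 departs, but an empty list when c1 departs, which is the difference in behavior described above.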


Best regards,

J D

> Bug in double barrier
> ---------------------
>
>                 Key: CURATOR-233
>                 URL: https://issues.apache.org/jira/browse/CURATOR-233
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Recipes
>    Affects Versions: 2.8.0
>            Reporter: J D
>            Assignee: Mike Drob
>             Fix For: 2.9.0
>
>         Attachments: DoubleBarrierClient.java, DoubleBarrierTester.java
>
>
> Hi,
> I think I discovered a bug in the internalLeave method of the double barrier implementation.
> When a client is told to leave the barrier after maxWait it does not do so. A flag is
> set, but the client does not leave the barrier; instead it keeps iterating through the
> control loop and drives CPU usage to 100%.
> I have attached an example.
> Best regards
> Lianro



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
