curator-dev mailing list archives

From "J D (JIRA)" <>
Subject [jira] [Commented] (CURATOR-233) Bug in double barrier
Date Wed, 26 Aug 2015 15:05:45 GMT


J D commented on CURATOR-233:

Hi Mike,

I can confirm that the code works for 2 nodes.

However, I think the two lines marked below should be in the else statement.

            String watchPath; // Watch somebody else that still exists
            if ( ourIndex == 0 )
            {
                watchPath = ZKPaths.makePath(barrierPath, children.get(children.size() - 1));
            }
            else
            {
                watchPath = ZKPaths.makePath(barrierPath, children.get(0));
                checkDeleteOurPath(ourNodeShouldExist); //here
                ourNodeShouldExist = false; //here
            }

            Stat stat = client.checkExists().usingWatcher(watcher).forPath(watchPath);

            checkDeleteOurPath(ourNodeShouldExist); //not here
            ourNodeShouldExist = false; //not here

As you guessed correctly, the fix changes the behavior for 3+ nodes. The reason is that the
exit barrier uses a shortcut that is not compatible with client 0 leaving prematurely:

Client 0 watches another node (and leaves the barrier once only its own node is left).
All other clients watch client 0 (and leave the barrier once client 0 has left).
Thus, if any client other than client 0 leaves after maxWaitMs, nothing happens and all
remaining clients keep waiting.
But if client 0 leaves after maxWaitMs, all other nodes leave together with client 0 (even
if they do not have a maxWaitMs time limit).
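
The watch layout described above can be modelled without ZooKeeper. The following is only a rough sketch under my own assumptions: the class ExitBarrierWatchSketch and its methods are hypothetical names, not part of Curator, and the child names c0..c2 are made up. It assumes client 0 watches the last child in sorted order (as in the snippet above) and every other client watches the first child, then reports whose watch would fire when a given node disappears.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the exit-barrier watch layout; not Curator code.
public class ExitBarrierWatchSketch {

    // The lowest client watches the highest child; everyone else watches the lowest.
    public static String watchTarget(List<String> sortedChildren, int ourIndex) {
        if (ourIndex == 0) {
            return sortedChildren.get(sortedChildren.size() - 1);
        }
        return sortedChildren.get(0);
    }

    // Clients whose watch fires when the node `departed` is deleted.
    public static List<String> notifiedWhenLeaves(List<String> sortedChildren, String departed) {
        List<String> notified = new ArrayList<>();
        for (int i = 0; i < sortedChildren.size(); i++) {
            String self = sortedChildren.get(i);
            if (!self.equals(departed) && watchTarget(sortedChildren, i).equals(departed)) {
                notified.add(self);
            }
        }
        return notified;
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("c0", "c1", "c2");
        // A middle client timing out wakes nobody.
        System.out.println(notifiedWhenLeaves(children, "c1")); // prints []
        // Client 0 timing out wakes every other client.
        System.out.println(notifiedWhenLeaves(children, "c0")); // prints [c1, c2]
        // The highest client leaving only wakes client 0, which then re-checks the children.
        System.out.println(notifiedWhenLeaves(children, "c2")); // prints [c0]
    }
}
```

In this model the asymmetry is visible directly: only the departure of c0 reaches the other clients, which matches the behavior change for 3+ nodes.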

Best regards,


> Bug in double barrier
> ---------------------
>                 Key: CURATOR-233
>                 URL:
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Recipes
>    Affects Versions: 2.8.0
>            Reporter: J D
>            Assignee: Mike Drob
>             Fix For: 2.9.0
>         Attachments:,
> Hi,
> I think I discovered a bug in the internalLeave method of the double barrier implementation.
> When a client is told to leave the barrier after maxWait it does not do so. A flag is
> set but the client does not leave the barrier; instead it keeps iterating through the
> control loop and drives CPU usage to 100%.
> I have attached an example.
> Best regards
> Lianro

