curator-dev mailing list archives

From Jordan Zimmerman
Subject Re: DistributedDoubleBarrier semantics
Date Tue, 25 Aug 2015 16:04:37 GMT
Unfortunately, the recipes don’t list edge cases. The algorithm is described here:

As I recall, when I wrote it I just followed the recipe as listed. I think some others have
contributed to the code as well. I never really understood what the use-case for this recipe
is - I’ve never needed it.

Let’s add time to your scenario (where EB is entered/blocked, C is crashed, W is working
and * is not active):

     C1     C2     C3   C4
T1   EB      *      *    *
T2   EB     EB      *    *
T3   EB      C      *    *
T4   EB      *     EB    *

It seems to me that at T4, C1 cannot reliably know that C2 ever entered. It may have gotten
a notification or not depending on timing. So, the state of the barrier is unknown, right?
If C1 gets notified prior to C2 crashing, though, the barrier should proceed by definition.
So, I guess the real question is how the barrier should handle errors, i.e., at T2 it has
enough members to continue but after that one of the members leaves prematurely. We may have
to allow users to specify the behavior. If users can handle members crashing, then it works
like this:

     C1     C2     C3   C4
T1   EB      *      *    *
T2   EB     EB      *    *
T3   EB      C      *    *
T4   EB      *     EB    *
T5   Barrier can start as there are enough members
T6   W       *      C    *   this is OK because we had 2 enter at T5
T7   Here C1 blocks and times out on leave
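
To make the timeline above concrete, here is a rough, single-threaded sketch that replays T1 through T7 with n=2. It is a toy model, not Curator code: the class, its methods, and the crashCountsAsLeave flag are all hypothetical, "blocking" is reduced to a boolean result, and late entry after the barrier starts is deliberately left unmodeled because that is the open question in this thread.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy, single-threaded model of the timeline above. This is NOT Curator code:
// all names here are hypothetical, and "blocking" is modeled as a boolean result.
class ToyDoubleBarrier {
    private final int memberQty;
    // Hypothetical user-chosen policy: does a crash count as a leave?
    private final boolean crashCountsAsLeave;
    private final Set<String> waiting = new LinkedHashSet<>();      // entered, still blocked (EB)
    private final Set<String> pendingLeave = new LinkedHashSet<>(); // started, not yet left
    private boolean started = false;

    ToyDoubleBarrier(int memberQty, boolean crashCountsAsLeave) {
        this.memberQty = memberQty;
        this.crashCountsAsLeave = crashCountsAsLeave;
    }

    // A client registers and blocks (EB). Late entry after start is not modeled.
    void enter(String client) {
        if (started) {
            throw new IllegalStateException("late entry is not modeled");
        }
        waiting.add(client);
    }

    // A client disappears without calling leave().
    void crash(String client) {
        waiting.remove(client);
        if (crashCountsAsLeave) {
            pendingLeave.remove(client);
        }
    }

    // Models the moment a blocked client counts the members currently present.
    // Returns true once the barrier has started.
    boolean tryStart() {
        if (!started && waiting.size() >= memberQty) {
            started = true;
            pendingLeave.addAll(waiting);
            waiting.clear();
        }
        return started;
    }

    // Returns true if the caller could return at once, false if it would
    // block (and eventually time out) waiting for the other members.
    boolean leave(String client) {
        pendingLeave.remove(client);
        return pendingLeave.isEmpty();
    }
}

class BarrierTimelineDemo {
    // Replays T1..T7 with n=2 and returns whether C1's leave() completes.
    static boolean runTimeline(boolean crashCountsAsLeave) {
        ToyDoubleBarrier barrier = new ToyDoubleBarrier(2, crashCountsAsLeave);
        barrier.enter("C1");        // T1
        barrier.enter("C2");        // T2 (C1 has not yet observed C2)
        barrier.crash("C2");        // T3
        barrier.enter("C3");        // T4
        barrier.tryStart();         // T5: C1 and C3 are present, barrier starts
        barrier.crash("C3");        // T6: C1 keeps working
        return barrier.leave("C1"); // T7
    }

    public static void main(String[] args) {
        System.out.println("strict policy, C1 leave completes: " + runTimeline(false));
        System.out.println("crash-as-leave policy, C1 leave completes: " + runTimeline(true));
    }
}
```

With the strict policy, C1's leave does not complete, which matches T7. With the hypothetical crash-as-leave policy, C1's leave completes. Calling tryStart() right after T2 instead would start the barrier with C1 and C2, which is the "notified prior to C2 crashing" case.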

On August 25, 2015 at 10:15:55 AM, Mike Drob wrote:


I was working on CURATOR-233 and I realized that I don't really understand  
the semantics of the DistributedDoubleBarrier when there are more clients  
than the given member quantity. Note that I'm not asking about what the  
code currently does (because I suspect it has a few inconsistencies) but  
what it should do according to the API contract.  

Let's suppose I create a DDB with n=2.  

client1.enter() // blocks until at least 2 clients enter  
client2.enter() // returns immediately  

client3.enter() // returns immediately?  

client3.leave() // blocks until all clients leave?  
client2.leave() // blocks until all clients leave?  

client4.enter() // imagine a straggling thread, no idea what should even
                // happen here

