zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Barrier Tutorial Possible Deadlock
Date Mon, 09 May 2011 04:10:56 GMT
Justin,

I think you are correct that there is a bug in the recipe and I think that
Xing has the core of the solution.

One solution is for the contents of the root to be an enum with the
following states:

STOP
RUN

The initial value should be STOP.

The enter method should be changed so that it gets the children list as
before and checks the size.  If the size is large enough, the code should
set the value of the root to be RUN.  If the size is not large enough, the
value of the root should be interrogated and if it is RUN, the program
should continue.  If the size is too small and the value is STOP, then the
code should wait as before.

The leave method should work the same as before except that when the number
of children reaches zero, the state should be set to STOP.
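
To make the enter and leave rules above concrete, here is a minimal sketch as a
single-threaded, in-memory model.  It makes no ZooKeeper calls: the set stands
in for the children of the root and the field stands in for the root's data.
The class and method names (TwoStateBarrierModel, enter, leave) are
hypothetical and not part of the tutorial, and a false return stands for "wait
on a watch and try again".

```java
import java.util.HashSet;
import java.util.Set;

/** In-memory model of the two-state barrier rules; no ZooKeeper calls are made. */
class TwoStateBarrierModel {
    enum State { STOP, RUN }

    private final int size;                                // processes needed to open the barrier
    private final Set<String> children = new HashSet<>();  // stands in for the root's child nodes
    private State state = State.STOP;                      // stands in for the root's contents

    TwoStateBarrierModel(int size) {
        this.size = size;
    }

    State state() {
        return state;
    }

    /** One pass of enter: true means proceed, false means wait for a watch and retry. */
    boolean enter(String name) {
        children.add(name);                 // registering twice is harmless in this model
        if (children.size() >= size) {
            state = State.RUN;              // enough processes arrived: open the barrier
            return true;
        }
        return state == State.RUN;          // too few children, but the barrier already opened
    }

    /** One pass of leave: true once the barrier is empty, false means wait and retry. */
    boolean leave(String name) {
        children.remove(name);
        if (children.isEmpty()) {
            state = State.STOP;             // last one out resets the barrier
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TwoStateBarrierModel b = new TwoStateBarrierModel(2);
        System.out.println(b.enter("p1"));  // p1 arrives alone
        System.out.println(b.enter("p2"));  // p2 completes the barrier
        System.out.println(b.leave("p2"));  // fast p2 leaves before p1 wakes up
        System.out.println(b.enter("p1"));  // p1 re-checks after its watch fires
        System.out.println(b.leave("p1"));  // p1 empties the barrier
        System.out.println(b.state());
    }
}
```

The fourth call is the case from this thread: p1 sees only one child, but
because the root says RUN it continues instead of waiting forever.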

Note that this algorithm is robust to fast processes on the start side.  How
to handle processes that arrive while other processes are leaving the
Barrier is somewhat debatable.  It might be preferable to add a third enum
state (LEAVING) to prevent additional processes from entering as soon as the
first process starts to leave the Barrier.  The first process to start
leaving the Barrier would set the enum to LEAVING and the last process to
finish leaving the Barrier would set it to STOP.  Doing this would require
that processes watch the root contents as well as the children, but it avoids
the problem of processes entering the barrier while the rest are leaving the
barrier.
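
The three-state variant can be sketched the same way, again as a
single-threaded, in-memory model with hypothetical names and no ZooKeeper
calls.  One assumption here goes beyond what is described above: a process
that registered before the barrier opened is still allowed through while the
state is LEAVING, and only newcomers are held out.  Other choices are
possible.

```java
import java.util.HashSet;
import java.util.Set;

/** In-memory model of the STOP / RUN / LEAVING barrier; no ZooKeeper calls are made. */
class ThreeStateBarrierModel {
    enum State { STOP, RUN, LEAVING }

    private final int size;                                // processes needed to open the barrier
    private final Set<String> children = new HashSet<>();  // stands in for the root's child nodes
    private State state = State.STOP;                      // stands in for the root's contents

    ThreeStateBarrierModel(int size) {
        this.size = size;
    }

    State state() {
        return state;
    }

    boolean isMember(String name) {
        return children.contains(name);
    }

    /** One pass of enter: true means proceed, false means wait for a watch and retry. */
    boolean enter(String name) {
        boolean registered = children.contains(name);
        if (state == State.LEAVING && !registered) {
            return false;                   // newcomer: stay out until the state is STOP again
        }
        children.add(name);
        if (state == State.STOP && children.size() >= size) {
            state = State.RUN;              // enough processes arrived: open the barrier
        }
        return state != State.STOP;         // RUN or LEAVING lets a registered process through
    }

    /** One pass of leave: true once the barrier is empty, false means wait and retry. */
    boolean leave(String name) {
        if (state == State.RUN) {
            state = State.LEAVING;          // first process to leave closes the door
        }
        children.remove(name);
        if (children.isEmpty()) {
            state = State.STOP;             // last process to leave resets the barrier
            return true;
        }
        return false;
    }
}
```

With a barrier of size 2, a third process that calls enter after the first
leave gets false and creates no child, so it cannot slip into the round that
is draining.  It joins the next round once the state is back to STOP.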

On Sun, May 8, 2011 at 8:13 PM, 邢迪侃 <xingdk.apex@gmail.com> wrote:

> Let Process 2 leave something on ZK (create a file, for
> example) before it starts to compute, and let the others watch that thing.
>
