zookeeper-dev mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: thoughts about extension to multi semantics
Date Sat, 17 Aug 2019 21:50:18 GMT
On Sat, Aug 17, 2019 at 10:19 AM Jordan Zimmerman <
jordan@jordanzimmerman.com> wrote:

> Some thoughts:
> It doesn't really help with any of the "standard" recipes as they all need
> to set watches.

I don't understand that. Watches can be set in a multi.

> Not to open a can of worms, but if there were a firehose version of
> watches that could be set independently, this type of multi-op could
> radically simplify some of the recipes. i.e. one could imagine a multi-op
> that creates an ephemeral node and then returns a sorted list of child node
> names so that leader election and locks can be done in one shot.

I don't understand that, either. But this time I just don't understand what
you are suggesting and how it helps.

> An atomic counter could be done much more simply than how Curator does it
> now as the test/increment could be done server side

I don't think so. No arithmetic is included in the current multi.
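
For reference, without server-side arithmetic an increment is a read, a
client-side add, and a version-checked write, retried on conflict. A minimal
sketch of that loop in Python against an in-memory stand-in. VersionedNode,
BadVersion and the before_write hook are hypothetical names for illustration,
not ZooKeeper or Curator API:

```python
class BadVersion(Exception):
    """Stand-in for the error a version-checked write raises on mismatch."""


class VersionedNode:
    """Hypothetical in-memory stand-in for one znode holding an integer."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0

    def get(self):
        return self.value, self.version

    def set(self, value, expected_version):
        # The write only succeeds if nobody else has written since our read.
        if expected_version != self.version:
            raise BadVersion()
        self.value = value
        self.version += 1


def increment(node, delta=1, before_write=None):
    """Read, add on the client, write conditionally; retry on conflict."""
    while True:
        value, version = node.get()
        if before_write is not None:
            before_write()  # hook used only to simulate a concurrent writer
        try:
            node.set(value + delta, version)
            return value + delta
        except BadVersion:
            continue  # someone else won the race; re-read and try again
```

The arithmetic happens entirely on the client here, which is the part a
server-side test/increment would remove.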

> Queues would be easier (possibly - I need to think about this some more).
> Curator's queue code is very complex.

I could imagine some simplification. Suppose that our queue is either an
empty directory or it looks like this:

[image: image.png]
(figure also at https://www.dropbox.com/s/qwwn9ahgxqh9iyf/queue.png?dl=0)

The idea is that the master znode is used to coordinate directory updates
and each running or pending task has an ephemeral znode. Whenever the
currently running task finishes or crashes, the corresponding task znode
will disappear and wake up the next pending task. If a task in the middle
of the queue disappears, the next task in line will wake up and should
determine what it should start watching.
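
The wake-up decision can be sketched as a pure function over the directory
listing. This assumes task znode names sort in queue order (for example a
zero-padded numeric suffix) and share a "task-" prefix. Both are assumptions
for illustration, as is the name next_action:

```python
def next_action(children, me, prefix="task-"):
    """Decide what a woken task should do, given the directory contents.

    Returns ("run", None) if `me` is at the head of the queue, otherwise
    ("watch", name) with the name of the task immediately ahead of it.
    """
    # Ignore anything that is not a task znode (such as the master znode).
    tasks = sorted(c for c in children if c.startswith(prefix))
    idx = tasks.index(me)
    if idx == 0:
        return ("run", None)
    return ("watch", tasks[idx - 1])
```

If a task in the middle of the queue disappears, the task behind it calls
this again on the fresh listing and ends up watching the task that was ahead
of the one that vanished.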

Some issues occur when we would like to be sure that we either create the
master znode (because we have an empty queue) or read the master to find
out what the last task should be. Multi and first_multi can both help with
this.

repeat {
     one_of {
           create master znode
           create ephemeral task znode
     } or {
           get master znode version
           get directory contents
     }

     // If we didn't create a new master znode, look at the directory and
     // pick a task znode name to create and a task znode to watch. If the
     // multi below doesn't succeed, something changed and we repeat.
     multi {
           create ephemeral task znode
           write to master znode and verify version of master znode
           put watch on previous znode in queue (if any)
     }
} until success
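
A rough simulation of that retry loop, using an in-memory stand-in rather
than a real ensemble. FakeStore, MultiFailed and enqueue are hypothetical
names. Storing a counter in the master znode and the "task-" naming are
assumptions, since the pseudocode does not say what is written to the master.
The one_of step is emulated with an existence check followed by an
all-or-nothing multi, and the sketch only returns the name of the znode to
watch instead of registering a watch:

```python
class MultiFailed(Exception):
    """Raised when any operation in a multi fails; nothing is applied."""


class FakeStore:
    """Hypothetical in-memory znode tree: path -> [data, version]."""

    def __init__(self):
        self.nodes = {}

    def exists(self, path):
        return path in self.nodes

    def get(self, path):
        data, version = self.nodes[path]
        return data, version

    def children(self, parent):
        prefix = parent + "/"
        return sorted(p[len(prefix):] for p in self.nodes if p.startswith(prefix))

    def multi(self, ops):
        """Apply all ops or none, mimicking an all-or-nothing multi."""
        staged = {p: list(v) for p, v in self.nodes.items()}
        for op in ops:
            if op[0] == "create":
                _, path, data = op
                if path in staged:
                    raise MultiFailed(f"node exists: {path}")
                staged[path] = [data, 0]
            elif op[0] == "set":
                _, path, data, expected = op
                if path not in staged or staged[path][1] != expected:
                    raise MultiFailed(f"bad version: {path}")
                staged[path] = [data, expected + 1]
            else:
                raise ValueError(f"unknown op: {op[0]}")
        self.nodes = staged  # commit only if every op succeeded


def enqueue(store, queue="/queue", max_attempts=10):
    """Join the queue; return (own task name, task to watch or None)."""
    master = queue + "/master"
    for _ in range(max_attempts):
        try:
            if not store.exists(master):
                # Empty queue: create the master and the first task together.
                store.multi([("create", master, "0"),
                             ("create", queue + "/task-0000000000", "")])
                return "task-0000000000", None
            # Otherwise read the master version and the directory contents.
            data, version = store.get(master)
            tasks = [c for c in store.children(queue) if c.startswith("task-")]
            seq = int(data) + 1
            name = f"task-{seq:010d}"
            # Create our task and bump the master, conditional on its version.
            store.multi([("create", f"{queue}/{name}", ""),
                         ("set", master, str(seq), version)])
            return name, (tasks[-1] if tasks else None)
        except MultiFailed:
            continue  # something changed under us; re-read and try again
    raise RuntimeError("could not join the queue")
```

The version check on the master is what makes a concurrent enqueue fail
cleanly: a stale version aborts the whole multi, so no task znode is left
behind.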

When a task finishes or crashes, it can simply delete its own task znode
(or let the znode evaporate on its own). If there is another task pending,
it will be notified. Whenever a task is notified, it should get the master
version and the directory contents and decide who it should watch (if it
isn't the head of the queue) or that it should start work (if it is the
head of the queue). Either way, after making such a decision, it should
verify the master version, write to the master and set a watch using a
multi.

> Anyway - I'll try to spend some time in Curator's various recipes to see
> how they would be simplified if this server-side feature was available.

Very cool. Very interested in hearing more thoughts on this.
