zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Using zookeeper to assign a bunch of long-running tasks to nodes (without unhandled tasks and double-handled tasks)
Date Sat, 23 Jan 2010 20:09:01 GMT
This should roughly work.  The one case I have seen that this scheme would
not handle well is a process that runs anomalously long.

As such, I would include an expected time of completion as well as the
process id in the task's ephemeral file.  Then you can run a periodic
cleanup process to look for tasks that have outlived their expected span of
time.  Any process that has run much longer than expected can be killed.
That should cause the ephemeral file for that process to vanish, and other
processes can then bid for the task.
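
As a rough illustration of that cleanup pass, here is a minimal Python
sketch.  It does not talk to ZooKeeper at all: `claims` is a plain dict
standing in for the contents of the ephemeral files, and the JSON payload
layout, the function names, and the `factor` threshold for "much longer than
expected" are all assumptions made for the example.

```python
import json


def make_claim(pid, started, expected_seconds):
    """Build the payload a worker would store in its ephemeral task file.

    The field names are hypothetical; the idea is only that the file
    carries the process id and the expected time of completion.
    """
    return json.dumps({
        "pid": pid,
        "started": started,
        "expected_done": started + expected_seconds,
    })


def overdue_claims(claims, now, factor=2.0):
    """Return (task_id, pid) pairs that ran much longer than expected.

    `claims` maps task id -> payload string (a stand-in for reading the
    ephemeral files).  A claim is overdue once its elapsed time exceeds
    `factor` times the span it was expected to take.
    """
    overdue = []
    for task_id, payload in sorted(claims.items()):
        claim = json.loads(payload)
        expected_span = claim["expected_done"] - claim["started"]
        elapsed = now - claim["started"]
        if elapsed > factor * expected_span:
            overdue.append((task_id, claim["pid"]))
    return overdue
```

The cleanup process would then kill each process it gets back.  A bare
process id only identifies a process on its own host, so in practice the
payload would probably need to say which node the process runs on as well.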

Of course you will also need a reliable way to signal completion of the task,
and you may need some way to indicate what kinds of output were produced and
where they are located.  Deleting the original task file is a natural way to
signal completion, but if any other state changes record the completion, you
have to finish those changes before deleting the task file.  That way, if the
process is killed, dies, or is disconnected before it has completely recorded
the result of the task, nobody will think that the task is done.
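
The ordering matters more than the mechanism, so here is a small Python
sketch of it.  Again this is only an in-memory model: `store` is a dict
standing in for the ZooKeeper tree, and the `/tasks/<id>` and `/results/<id>`
paths and the function names are hypothetical.

```python
def complete_task(store, task_id, output_location):
    """Record the result first, then delete the task file.

    If anything fails before the final delete, the task file is still
    present, so the task is not mistaken for a finished one.
    """
    task_path = "/tasks/" + task_id
    result_path = "/results/" + task_id
    if task_path not in store:
        raise KeyError("no such task: " + task_path)
    # Step 1: finish every state change that records the completion.
    store[result_path] = output_location
    # Step 2: only now signal completion by deleting the task file.
    del store[task_path]


def is_done(store, task_id):
    """A task counts as done once its task file is gone."""
    return "/tasks/" + task_id not in store
```

A worker that dies between step 1 and step 2 leaves a result behind with the
task file still in place, so whoever picks the task up next should be
prepared to overwrite or reuse that earlier result.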

On Sat, Jan 23, 2010 at 12:58 AM, Zheng Shao <zshao9@gmail.com> wrote:

> Each node will start 10 processes.
>  Each process will list the directory "/mytasks" with a watcher
>  If triggered by the watcher, we relist the directory.
>  If we find some missing files in the range of "0" to "99", we
> create an EPHEMERAL node with no-overwrite option
>    if the creation is successful, then we disable the watcher and
> start processing the corresponding task (if something goes wrong, just
> kill itself and the node will be gone)
>    if not, we go back to wait for watcher.
> Will this work?

Ted Dunning, CTO
