zookeeper-user mailing list archives

From Jon Bringhurst <jbringhu...@linkedin.com.INVALID>
Subject Re: Zookeeper a good fit?
Date Mon, 18 Aug 2014 20:30:18 GMT
Leader election is definitely a good fit for Zookeeper.

In my previous response, I assumed that you had many tasks which needed to be distributed
to several servers so they might be run in parallel.

As Alvaro said, Curator is a good fit for this:

https://curator.apache.org/curator-recipes/leader-election.html

Also, Kazoo:

http://kazoo.readthedocs.org/en/latest/_modules/kazoo/recipe/election.html
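For the one-owner-per-job case described in this thread, the Kazoo election recipe might be used roughly as follows. This is only a minimal sketch: the /jobs/<name>/election path layout, the run_job_as_leader helper, the job name, and the host string are assumptions made up for illustration. client.Election(path, identifier) and election.run(func) are the Kazoo recipe calls; run blocks until this node wins the election and then calls the function while it holds leadership.

```python
def election_path(job_name):
    """Hypothetical znode layout: one election path per background job."""
    return "/jobs/{}/election".format(job_name)


def run_job_as_leader(client, job_name, node_id, job_fn):
    """Block until this node is elected owner of job_name, then run job_fn.

    client is expected to be a started kazoo.client.KazooClient (or any
    object exposing the same Election recipe). Leadership is given up
    when job_fn returns.
    """
    election = client.Election(election_path(job_name), node_id)
    election.run(job_fn)


def main():
    # Assumes kazoo is installed and ZooKeeper is reachable at this address.
    import socket
    from kazoo.client import KazooClient

    zk = KazooClient(hosts="127.0.0.1:2181")
    zk.start()
    try:
        # "job-a" is a placeholder name for one of the background jobs.
        run_job_as_leader(zk, "job-a", socket.gethostname(),
                          lambda: print("this node owns job-a"))
    finally:
        zk.stop()
```

Calling main() on every node should leave exactly one node running the job at a time; the others wait inside run_job_as_leader and take over if the owner's session goes away. With several jobs, each node would presumably contend for each job's election from its own thread.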

-Jon

On Aug 18, 2014, at 12:04 PM, Alvaro Gareppe <agareppe@gmail.com> wrote:

That seems to be a perfect fit for ZooKeeper. Node coordination is pretty
easy using a leader election recipe (look at Curator if you like, or the
ZooKeeper recipes).


On Mon, Aug 18, 2014 at 3:11 PM, Phil Burress <philburresseme@gmail.com> wrote:

We have 5 or so background jobs that are all long running (start on system
start up and terminate on system shutdown or node removal). Basically I
just want to ensure that only a single node takes ownership of a particular
job, so that multiple nodes aren't attempting to run the same job.


On Mon, Aug 18, 2014 at 11:52 AM, Jon Bringhurst <jbringhurst@linkedin.com.invalid> wrote:

It really depends on the volume of tasks you're dealing with.

If you have a medium volume of background tasks (say, more than one task
every 5 seconds with high growth potential) it might be a good idea to
consider something which was designed to perform as a task queue (things
like http://www.celeryproject.org/, http://gearman.org/, and
https://github.com/resque/resque fit into this category).

If you have a much higher number of tasks (more than a few per second,
possibly millions of tasks per second), something like
https://kafka.apache.org/ (very high volume) or http://www.rabbitmq.com/
(medium-high volume) might be a good fit for the job.

If you have a very low volume of tasks (a few per minute), you might be
able to get away with a quick queue implementation directly on top of
ZooKeeper. Take a look at the queue recipes at
https://zookeeper.apache.org/doc/r3.1.2/recipes.html#sc_recipes_Queues
(also at
http://blog.cloudera.com/blog/2009/05/building-a-distributed-concurrent-queue-with-apache-zookeeper/).
It shouldn't be too much effort to whip up something using
https://kazoo.readthedocs.org/en/latest/.
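
As a rough idea of what such a low-volume queue could look like with Kazoo's built-in Queue recipe, here is a hedged sketch. The helper names, the JSON task format, the /queues/background-tasks path, and the host string are all assumptions for illustration. queue.put(bytes) and queue.get() (which returns None when the queue is empty) are the recipe calls being relied on.

```python
import json


def enqueue_task(queue, task):
    """Serialize a task dict to JSON bytes and append it to the queue."""
    queue.put(json.dumps(task).encode("utf-8"))


def dequeue_task(queue):
    """Return the next task as a dict, or None if the queue is empty."""
    data = queue.get()
    if data is None:
        return None
    return json.loads(data.decode("utf-8"))


def main():
    # Assumes kazoo is installed and ZooKeeper is reachable at this address.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts="127.0.0.1:2181")
    zk.start()
    try:
        queue = zk.Queue("/queues/background-tasks")  # hypothetical path
        enqueue_task(queue, {"job": "cleanup", "target": "tmp"})
        print(dequeue_task(queue))
    finally:
        zk.stop()
```

Every task here is a znode, so this only makes sense at the "few per minute" scale mentioned above.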

There are a ton of systems out there to do something like this (I've even
made one myself, https://github.com/hpc/libcircle), so there's a good
chance I've missed the one that would be perfect for your use case.
However, the links in this email should give you a decent starting point.

-Jon

On Aug 18, 2014, at 7:41 AM, Phil Burress <philburresseme@gmail.com> wrote:

Currently we have a cluster of machines running a single application. The
cluster performs various background tasks and we have a hacky, home-grown
solution for the nodes in the cluster to coordinate with each other to
perform these background tasks. It's very error-prone and we're looking to
replace it. Would Zookeeper be a good fit for coordinating something like
this? If so, are there any lightweight examples out there we could look at?

Thanks very much!

Phil






--
Ing. Alvaro Gareppe
agareppe@gmail.com

