zookeeper-user mailing list archives

From Ben Bangert <...@groovie.org>
Subject The State of Python Zookeeper libraries and collaboration
Date Thu, 17 May 2012 00:50:03 GMT
It would seem that about 6 months ago or so, there wasn't much out
there in terms of higher level Python libs for Zookeeper. There was the
Cloudera article on queues, and txzookeeper (which I'm sure many of us
not using twisted immediately ignored).

In the time since, several people including myself needed solutions
involving Zookeeper with Python and, seeing nothing out there, all
apparently began writing libraries (judging from the project timelines
in most cases). I've been collaborating with the author of zc.zk (Jim
Fulton) for a while and we decided it'd make more sense to merge our
efforts. In this spirit I began contacting all the other developers to
gauge their interest, and most have been interested.

I created a python-zk organization on GitHub to be the home for this
effort and moved over the zc.zk library (which people apparently had a
hard time locating), along with the fairly widely used statically compiled
Python Zookeeper binding.


Next up is to create the new merged core which I plan on basing mostly
around the cleanest implementation I have seen so far (which also
happens to be one of the only gevent compatible ones), kazoo. I've
talked with the primary author of Kazoo, and the name may remain with
the new merged package or it may get a new name if that doesn't work.
I'm not terribly tied to names as much as I am to solid, well tested,
well documented working code... but having catchy names does seem to help.

I'm currently working on this full-time, so I expect it to be in a
usable state in a week or so (hopefully not too optimistic). If you're
interested in helping out, the more the better; please feel free to
e-mail me directly or respond here.

This stuff is complex; it needs many eyes on it and lots of code review.

This hopefully explains why I'm so interested in having a single Python
Zookeeper library of similar caliber to Netflix's Curator, one that:
- Has very thorough unit/integration tests (100% coverage minimum)
- Cleanly handles connection loss
- Works under gevent or threaded/blocking
- Is very well documented (API docs and narrative)
- Implements all the Zookeeper recipes
- Provides Service Discovery/Management
- Offers higher level utility functions for common Zookeeper tasks

In the meantime, here is a summary of my research efforts and code
review (if something isn't accurate, please feel free to correct it).

Please don't take this as a critique; I'm just trying to document what
is out there for my own reference on merging and hopefully so other
people coming along don't continue to duplicate this work. :)

    - https://github.com/jrydberg/gevent-zookeeper/

    - Works under gevent
    - No tests
    - No documentation

    - https://github.com/nimbusproject/kazoo

    - Resilient Client
    - Basic Lock (Uses UUID properly)
    - Some Tests (Integrated)
    - No documentation (doc strings only)
    - Works under gevent

    - https://github.com/nkvoll/pykeeper

    - Higher level client (not resilient to errors)
    - Documentation
    - Some tests (Integrated)

txzookeeper (JuJu Team)
    - https://launchpad.net/txzookeeper

    - Resilient Client
    - Doesn't handle create node edge-case
    - Basic Lock (open bug filed to handle the UUID bit)
    - Queue, ReliableQueue, SerializedQueue
    - No documentation (doc strings only)
    - Usable only from twisted
    - Well tested (Integrated)

twitter zookeeper lib

    - Resilient Client
    - Handles create node edge-case
    - Service Registration/Discovery
    - Some documentation
    - Well tested (Integrated)
    - Tied to a lot of twitter commons code

zkpython (improvements to a fork of the official bindings)
    - https://github.com/duncf/zkpython/

    - Resilient Client
    - Basic Lock (Using unique id rather than UUID)
    - Handles create node edge-case
    - Some Tests (Integrated)
    - No additional docs

    - https://github.com/python-zk/zc.zk

    - Non-resilient Client (reconnects must be handled)
    - Higher level automatic watch functionality
    - Service Registration/Discovery
    - Well tested (Unit and Integration tests)
    - Documented (on usage, source code is missing doc strings)

    - https://github.com/mozilla-services/zktools

    - Relies on zc.zk
    - Shared Read/Write Locks
    - AsyncLock
    - Revokable Locks
    - Tests (Integrated)

    - https://github.com/davidmiller/zoop

    - Doesn't handle create node edge-case
    - Doesn't handle retryable exceptions
    - Revokable Lock (Doesn't handle create node edge-case, uses a permanent
                      node instead of ephemeral)
    - Tested (Unit tests via ZK mocks)
    - Well Documented (doc strings and narrative docs)
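
For anyone unsure what the "create node edge-case" and the UUID notes
above refer to, here is a rough sketch of my understanding: a
sequential create can succeed on the server while the reply is lost to
a connection drop, so a blind retry leaves two nodes behind. Putting a
UUID in the node name lets the client find its own node before
retrying. FakeZk, ConnectionLoss and create_lock_node are hypothetical
names for illustration only; this is an in-memory stand-in, not the API
of any library listed here.

```python
import uuid


class ConnectionLoss(Exception):
    """Hypothetical stand-in for a ZooKeeper connection-loss error."""


class FakeZk:
    """In-memory stand-in for a ZooKeeper client (not a real library API)."""

    def __init__(self, fail_first_create=False):
        self.children = []      # sequential node names under one parent
        self._counter = 0
        self._fail_next = fail_first_create

    def create_sequential(self, prefix):
        # The "server" creates the node and assigns a sequence number...
        name = "%s%010d" % (prefix, self._counter)
        self._counter += 1
        self.children.append(name)
        # ...but the reply may be lost before it reaches the client.
        if self._fail_next:
            self._fail_next = False
            raise ConnectionLoss("reply lost")
        return name

    def get_children(self):
        return list(self.children)


def create_lock_node(zk, max_tries=3):
    """Create exactly one sequential node, even if a reply is lost."""
    # The UUID makes this client's node recognisable after a failure.
    prefix = uuid.uuid4().hex + "-lock-"
    for _ in range(max_tries):
        try:
            return zk.create_sequential(prefix)
        except ConnectionLoss:
            # The create may have succeeded anyway; look for our node
            # before trying again so we never create a duplicate.
            for child in zk.get_children():
                if child.startswith(prefix):
                    return child
    raise ConnectionLoss("gave up after %d tries" % max_tries)


zk = FakeZk(fail_first_create=True)
node = create_lock_node(zk)
print(len(zk.get_children()))
```

Without the prefix check, the retry in this sketch would create a
second node and the first would sit there orphaned, which for a lock
means nobody behind it ever gets a turn.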

Ben Bangert
(ben@ || http://) groovie.org
