zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Using zookeeper to assign a bunch of long-running tasks to nodes (without unhandled tasks and double-handled tasks)
Date Mon, 25 Jan 2010 21:53:46 GMT
ZK-22 means that if the connection with one server is broken and is
automatically restored with another server then the client will not need to
be notified in most cases.

ZK already restores the connection in this way, but the result can be
difficult to understand, and the client cannot always easily do the right
thing.  The worst example that I know of is when the client was creating a
sequentially named file just before the failure.  On reconnect, the client
knows only that the request was interrupted by connection loss; it does not
know whether the file was created.  If the entire session is torn down and
re-opened, the side effects can be large, because the session's other
ephemeral files disappear.  With ZK-22, this case will be transparent to the
client.  Either the file will be created and the result will be returned to
the client, or the transaction will fail and the session will expire.
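
To make the sequential-create ambiguity concrete, here is a minimal sketch
of one common client-side workaround: put a unique tag in the file name and,
after a connection loss, list the children to see whether the create went
through.  Everything here (FakeServer, ConnectionLoss, create_once) is a
hypothetical in-memory stand-in, not the ZooKeeper client API, and it is not
how ZK-22 itself works.

```python
import uuid


class ConnectionLoss(Exception):
    """Stand-in for a connection-loss error (not a real ZooKeeper class)."""


class FakeServer:
    """Hypothetical in-memory stand-in for a ZooKeeper ensemble."""

    def __init__(self):
        self.children = []
        self.counter = 0

    def create_sequential(self, prefix, drop_reply=False):
        # The server applies the create first; the reply may then be lost.
        name = f"{prefix}{self.counter:010d}"
        self.counter += 1
        self.children.append(name)
        if drop_reply:
            raise ConnectionLoss("reply lost after the node was created")
        return name


def create_once(server, base, drop_reply=False):
    """Create a sequential node without duplicating it after a lost reply."""
    # A per-request tag lets us recognise our own node later.
    prefix = f"{base}-{uuid.uuid4().hex}-"
    try:
        return server.create_sequential(prefix, drop_reply=drop_reply)
    except ConnectionLoss:
        # After reconnecting, check whether a child with our tag exists.
        matches = [c for c in server.children if c.startswith(prefix)]
        if matches:
            return matches[0]  # the create did succeed; reuse it
        return server.create_sequential(prefix)  # it did not; retry


server = FakeServer()
first = create_once(server, "task")
second = create_once(server, "task", drop_reply=True)
```

In this sketch the second call loses its reply, finds its own tagged node on
recovery, and does not create a duplicate.  This is exactly the kind of
bookkeeping the client has to carry itself today.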

Most of the bugs that I have written into zookeeper client code and most of
the bugs that I have seen in other people's ZK client code have had to do
with handling connection loss exceptions that did not result in session
expiration.  Getting rid of those bugs is a huge win in my book.

Remember that there is a huge difference between connection loss (no big
deal after ZK-22; it won't even disturb the client) and session expiration
(a big deal, because you have to start over with a new session).
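
As a rough illustration of that difference, the sketch below models a client
that merely pauses on connection loss but rebuilds everything on session
expiration.  The State enum and Worker class are hypothetical names invented
for this example; they only loosely mirror the Disconnected, SyncConnected
and Expired states reported by the real client.

```python
from enum import Enum


class State(Enum):
    """Hypothetical connection states for illustration only."""
    CONNECTED = 1
    DISCONNECTED = 2
    EXPIRED = 3


class Worker:
    """Toy client that owns one ephemeral file under its session."""

    def __init__(self, path):
        self.path = path
        self.session_id = 1
        self.paused = False
        self.rebuilds = 0
        self.ephemerals = {path}

    def on_state(self, state):
        if state is State.DISCONNECTED:
            # Connection loss: the session and its ephemerals still exist.
            # Stop acting on possibly stale data and wait for a reconnect.
            self.paused = True
        elif state is State.CONNECTED:
            # Reconnected within the session timeout: just resume.
            self.paused = False
        elif state is State.EXPIRED:
            # Session expiration: the ephemerals are gone on the server,
            # so start over with a new session and re-register.
            self.ephemerals.clear()
            self.session_id += 1
            self.rebuilds += 1
            self.ephemerals.add(self.path)
            self.paused = False


worker = Worker("/workers/w1")
worker.on_state(State.DISCONNECTED)
paused_during_loss = worker.paused
session_during_loss = worker.session_id
worker.on_state(State.CONNECTED)
worker.on_state(State.EXPIRED)
```

Only the EXPIRED branch changes the session and forces re-registration; the
other two branches leave the session state untouched.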

On Mon, Jan 25, 2010 at 2:23 AM, Qing Yan <qingyan@gmail.com> wrote:

> About error handling, does ZK-22 mean disconnection will be eliminated
> from the API and will be solely handled by the ZK implementation?
> I am not sure it is such a good idea though. The application layer needs
> to be notified that communication with ZK has been broken - things may be
> out of sync - and enter safe mode accordingly. I would imagine in some
> cases the current "fail fast" behavior might be desirable.

This won't be necessary.  ZK will guarantee consistency.

> Another layer on top of the ZK API (with overridable behavior, like the
> ProtocolSupport stuff) seems to strike a balance here...

I think not.  ZK-22 really is a huge improvement in all respects.

Ted Dunning, CTO
