hadoop-zookeeper-user mailing list archives

From: Bryan Thompson <br...@systap.com>
Subject: RE: Guaranteed message delivery until session timeout?
Date: Wed, 30 Jun 2010 23:48:54 GMT
Ted,

Yes, that is clear.  I was looking for this:

> On some failures (communication errors, timeouts, etc) the client will not know if the
> update has applied or not. We take steps to minimize the failures, but the only guarantee
> is only present with successful return codes.

With regard to timeliness:

> The clients view of the system is guaranteed to be up-to-date within a certain time bound.
> (On the order of tens of seconds.) Either system changes will be seen by a client within this
> bound, or the client will detect a service outage.

This seems to imply that there are retries for transient communication failures.  Is that
true?

For example, if a client registers a watch, and a state change which would trigger that watch
occurs _after_ the client has successfully registered the watch with the ZooKeeper quorum,
is it possible that the client would not observe the watch trigger due to communication failure,
etc., even while the client's session remains valid?  It sounds like the answer is "no" per
the timeliness guarantee.  Is that correct?

Thanks,
Bryan

________________________________
From: Ted Dunning [mailto:ted.dunning@gmail.com]
Sent: Wednesday, June 30, 2010 7:38 PM
To: Patrick Hunt
Cc: zookeeper-user@hadoop.apache.org; Bryan Thompson
Subject: Re: Guaranteed message delivery until session timeout?

Also this:

Once an update has been applied, it will persist from that time forward until a client overwrites
the update. This guarantee has two corollaries:

- If a client gets a successful return code, the update will have been applied. On some failures
  (communication errors, timeouts, etc) the client will not know if the update has applied or
  not. We take steps to minimize the failures, but the only guarantee is only present with successful
  return codes. (This is called the monotonicity condition in Paxos.)
- Any updates that are seen by the client, through a read request or successful update, will
  never be rolled back when recovering from server failures.

I think that the clear implications here are:

a) if you get a successful return code and no session expiration, your ephemeral file is there

b) if the ephemeral file is created, you might not get the successful return code (due to
connection loss), but the ephemeral file might continue to exist (because connection loss
!= session loss)

c) if you get a failure return code, your ephemeral file was not created
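
The three cases above can be sketched as a small simulation. This is only an illustrative toy model, not the ZooKeeper client API: FakeEnsemble, ConnectionLoss, NodeExists, ensure_ephemeral and the fault parameter are all hypothetical names invented for the example. It assumes that a connection-loss error leaves the outcome unknown, and that the client can resolve the ambiguity by checking, within the same session, whether the node exists and is owned by that session.

```python
class ConnectionLoss(Exception):
    """Hypothetical error: the reply was lost, so the outcome is unknown."""


class NodeExists(Exception):
    """Hypothetical definite failure: this request created nothing."""


class FakeEnsemble:
    """Toy in-memory stand-in for the service; NOT the real ZooKeeper API."""

    def __init__(self):
        self.nodes = {}  # path -> id of the session that owns the node

    def create_ephemeral(self, path, session_id, fault=None):
        if fault == "drop_request":
            # The request never arrives: nothing is applied.
            raise ConnectionLoss(path)
        if path in self.nodes:
            raise NodeExists(path)
        self.nodes[path] = session_id
        if fault == "drop_reply":
            # The update is applied, but the client never sees the reply.
            raise ConnectionLoss(path)
        return path

    def owner(self, path):
        return self.nodes.get(path)


def ensure_ephemeral(ensemble, path, session_id, fault=None):
    """Try to create an ephemeral node and classify the outcome."""
    try:
        ensemble.create_ephemeral(path, session_id, fault)
        return "created"  # case (a): successful return code
    except ConnectionLoss:
        # case (b): ambiguous. Assuming the session is still valid,
        # look at the node to find out whether the create was applied.
        if ensemble.owner(path) == session_id:
            return "created-unconfirmed"
        return "not-created"
    except NodeExists:
        return "not-created"  # case (c): definite failure code


ensemble = FakeEnsemble()
print(ensure_ephemeral(ensemble, "/lock", session_id=1))
print(ensure_ephemeral(ensemble, "/lock", session_id=2))
print(ensure_ephemeral(ensemble, "/a", session_id=1, fault="drop_reply"))
print(ensure_ephemeral(ensemble, "/b", session_id=1, fault="drop_request"))
```

The point of the sketch is only that connection loss has to be treated as "unknown" rather than as failure. How a real client re-checks after reconnecting depends on the actual API and is not shown here.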

On Wed, Jun 30, 2010 at 4:33 PM, Patrick Hunt <phunt@apache.org> wrote:
in particular see "timeliness" http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkGuarantees

