hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "ZooKeeper/FAQ" by BenjaminReed
Date Fri, 09 Jan 2009 13:02:43 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by BenjaminReed:

  1) What are the state transitions of ZooKeeper?
+ attachment:state_dia.png
  2) How should I handle the CONNECTION_LOSS error?
+ CONNECTION_LOSS means the link between the client and server was broken. It doesn't necessarily
mean that the request failed. If you are doing a create request and the link was broken after
the request reached the server but before the response was returned, the create request
succeeds. If the link was broken before the packet went onto the wire, the create request fails.
The client library cannot tell these two cases apart, so it returns CONNECTION_LOSS in both.
The programmer must determine whether the request succeeded or needs to be retried. Usually this
is done in an application-specific way. Examples of success detection include checking for
the presence of a znode to be created or checking the value of a znode to be modified.
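The success-detection idea can be sketched as follows. This is a minimal illustration, not the client library's API: NodeStore, ConnectionLossException, CreateWithRetry and FlakyStore are hypothetical stand-ins invented here so the logic runs without a server, and the sketch assumes the data written identifies the caller, so finding it in the node means our own create went through.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the client's CONNECTION_LOSS error (not the real exception type).
class ConnectionLossException extends RuntimeException {
    ConnectionLossException(String message) { super(message); }
}

// Hypothetical minimal view of a client: only the two calls this sketch needs.
interface NodeStore {
    void create(String path, String data); // may throw ConnectionLossException
    String read(String path);              // returns null if the node is absent
}

final class CreateWithRetry {
    private CreateWithRetry() {}

    // Returns true if the node holds our data, false if it holds someone else's.
    // Rethrows the last connection loss if the outcome is still unknown.
    static boolean createReliably(NodeStore store, String path, String data, int maxAttempts) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        ConnectionLossException last = null;
        boolean outcomeUnknown = false;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (outcomeUnknown) {
                    // An earlier create may have reached the server: check before retrying.
                    String existing = store.read(path);
                    if (existing != null) return existing.equals(data);
                }
                store.create(path, data);
                return true; // a response arrived, so the create succeeded
            } catch (ConnectionLossException lost) {
                outcomeUnknown = true; // cannot tell whether the request was applied
                last = lost;
            }
        }
        throw last;
    }
}

// In-memory fake: the first create is applied, but its response is "lost".
final class FlakyStore implements NodeStore {
    private final Map<String, String> nodes = new HashMap<>();
    private boolean dropNextResponse = true;
    int createCalls = 0;

    public void create(String path, String data) {
        createCalls++;
        if (nodes.containsKey(path)) throw new IllegalStateException("node exists: " + path);
        nodes.put(path, data);
        if (dropNextResponse) {
            dropNextResponse = false;
            throw new ConnectionLossException("link broke before the response returned");
        }
    }

    public String read(String path) { return nodes.get(path); }
}
```

With the fake above, CreateWithRetry.createReliably(new FlakyStore(), "/lock", "owner-1", 3) issues a single create: the second attempt finds the node already holding "owner-1" and reports success instead of creating it again.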
  3) How should I handle SESSION_EXPIRED?
+ SESSION_EXPIRED automatically closes the ZooKeeper handle. In a correctly operating cluster,
you should never see SESSION_EXPIRED. It means that the client was partitioned off from the
ZooKeeper service for more than the session timeout and ZooKeeper decided that the client died.
Because the ZooKeeper service is ground truth, the client should consider itself dead and
go into recovery. If the client is only reading state from ZooKeeper, recovery means just
reconnecting. In more complex applications, recovery means recreating ephemeral nodes, vying
for leadership roles, and reconstructing published state.
+ Library writers should be conscious of the severity of the expired state and not try to
recover from it. Instead libraries should return a fatal error. Even if the library is simply
reading from ZooKeeper, the user of the library may also be doing other things with ZooKeeper
that require more complex recovery.
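The division of labor described above might look like the following sketch. All names here (Handle, SessionExpiredException, ConfigLibrary, Application, FakeHandle, the /config and /members paths) are hypothetical and exist only to make the example self-contained; the point is that the library lets the expiry propagate, while the application, which knows everything it published, performs the recovery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the SESSION_EXPIRED error.
class SessionExpiredException extends RuntimeException {}

// Hypothetical minimal client handle; both calls may throw SessionExpiredException.
interface Handle {
    String read(String path);
    void createEphemeral(String path);
}

// Library code: it never reconnects behind the caller's back.
final class ConfigLibrary {
    private final Handle handle;

    ConfigLibrary(Handle handle) { this.handle = handle; }

    String lookup(String key) {
        // No catch-and-retry here: an expired session surfaces to the caller as fatal.
        return handle.read("/config/" + key);
    }
}

// Application code: owns the handle and the full recovery procedure.
final class Application {
    private final Supplier<Handle> connect;
    private ConfigLibrary config;
    int recoveries = 0;

    Application(Supplier<Handle> connect) {
        this.connect = connect;
        start();
    }

    private void start() {
        Handle handle = connect.get();            // brand-new session
        handle.createEphemeral("/members/app-1"); // recreate ephemeral state
        config = new ConfigLibrary(handle);       // rebuild the library on the new handle
    }

    String lookup(String key) {
        try {
            return config.lookup(key);
        } catch (SessionExpiredException expired) {
            recoveries++;
            start(); // consider ourselves dead, then recover from scratch
            return config.lookup(key);
        }
    }
}

// In-memory fake handle whose session can be expired by flipping a flag.
final class FakeHandle implements Handle {
    boolean expired = false;
    final List<String> ephemerals = new ArrayList<>();

    public String read(String path) {
        if (expired) throw new SessionExpiredException();
        return "value-of-" + path;
    }

    public void createEphemeral(String path) {
        if (expired) throw new SessionExpiredException();
        ephemerals.add(path);
    }
}
```

Keeping the catch block in Application rather than ConfigLibrary means the ephemeral node is recreated on the same code path that replaces the handle, so the library cannot silently reconnect and leave the application's published state missing.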
  4) Is there an easy way to expire a session for testing?
+ Yes, a ZooKeeper handle can take a session id and password. This constructor is used to
recover a session after total application failure. For example, an application can connect
to ZooKeeper, save the session id and password to a file, terminate, restart, read the session
id and password, and reconnect to ZooKeeper without losing the session and the corresponding
ephemeral nodes. It is up to the programmer to ensure that the session id and password aren't
passed around to multiple instances of an application; otherwise problems can result.
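The save-and-restore step can be sketched with plain file I/O. SavedSession and its file layout are assumptions made for this illustration. In the Java client the values would come from getSessionId() and getSessionPasswd() on the handle and be passed back to the ZooKeeper constructor overload that accepts a session id and password; those calls are not exercised here.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical holder for the two values needed to reattach to a session.
final class SavedSession {
    final long sessionId;
    final byte[] password;

    SavedSession(long sessionId, byte[] password) {
        this.sessionId = sessionId;
        this.password = password.clone();
    }

    // Layout (an assumption of this sketch): 8-byte id, 4-byte length, password bytes.
    void writeTo(Path file) {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeLong(sessionId);
            out.writeInt(password.length);
            out.write(password);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static SavedSession readFrom(Path file) {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            long id = in.readLong();
            byte[] pw = new byte[in.readInt()];
            in.readFully(pw);
            return new SavedSession(id, pw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only one process should ever read this file back and reconnect, for the reason given above: two live handles on the same session interfere with each other.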
+ In testing we want to cause exactly that problem. To explicitly expire a session, an
application connects to ZooKeeper, saves the session id and password, creates another ZooKeeper
handle with that id and password, and then closes the new handle. Since both handles reference
the same session, closing the second handle invalidates the session, causing a SESSION_EXPIRED
on the first handle.
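The sequence can be modeled in memory to show why it works. ToyService, ToyHandle, ExpiredException and SessionExpiryTrick are hypothetical classes written for this sketch and are not part of ZooKeeper; the accessor names merely mirror the client's getSessionId() and getSessionPasswd(). Against a real cluster the same three steps would use the real handle and the constructor that takes a session id and password.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for SESSION_EXPIRED.
class ExpiredException extends RuntimeException {}

// In-memory stand-in for the service: tracks live sessions by id.
final class ToyService {
    private final Map<Long, byte[]> sessions = new HashMap<>();
    private long nextId = 1;

    // Opens a brand-new session.
    ToyHandle connect() {
        long id = nextId++;
        byte[] pw = {(byte) id, 42};
        sessions.put(id, pw);
        return new ToyHandle(this, id, pw);
    }

    // Attaches another handle to an existing session, given its id and password.
    ToyHandle connect(long id, byte[] pw) {
        byte[] known = sessions.get(id);
        if (known == null || !Arrays.equals(known, pw)) throw new ExpiredException();
        return new ToyHandle(this, id, pw);
    }

    boolean isLive(long id) { return sessions.containsKey(id); }

    void end(long id) { sessions.remove(id); }
}

final class ToyHandle {
    private final ToyService service;
    private final long id;
    private final byte[] pw;

    ToyHandle(ToyService service, long id, byte[] pw) {
        this.service = service;
        this.id = id;
        this.pw = pw.clone();
    }

    long getSessionId() { return id; }

    byte[] getSessionPasswd() { return pw.clone(); }

    // Any request on a handle whose session has ended reports expiry.
    void ping() {
        if (!service.isLive(id)) throw new ExpiredException();
    }

    // Closing a handle ends the session it references.
    void close() { service.end(id); }
}

final class SessionExpiryTrick {
    private SessionExpiryTrick() {}

    // Opens a second handle on the victim's session and closes it.
    static void expire(ToyService service, ToyHandle victim) {
        ToyHandle twin = service.connect(victim.getSessionId(), victim.getSessionPasswd());
        twin.close();
    }
}
```

After SessionExpiryTrick.expire(service, first), the next first.ping() throws ExpiredException, which is the point in a test where the application's recovery path should be exercised.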
