hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "ZooKeeper/GSoCReadOnlyMode" by SergeyDoroshenko
Date Mon, 05 Jul 2010 17:13:58 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "ZooKeeper/GSoCReadOnlyMode" page has been changed by SergeyDoroshenko.
http://wiki.apache.org/hadoop/ZooKeeper/GSoCReadOnlyMode?action=diff&rev1=6&rev2=7

--------------------------------------------------

  This project will implement a 'read-only' mode for ZooKeeper servers that allows read requests
to be served as long as the client can contact a server
  
  == Detailed description ==
- === Client-side changes ===
+ === Client-side ===
   * '''API:'''<<BR>>To enable read-only functionality, the user passes an optional
boolean parameter to the ZooKeeper constructor. When a client with r-o mode enabled is connected
to a server-in-majority, it behaves as a normal one. But if the server is partitioned, read requests
issued by such a client are allowed, while write requests fail with an exception.<<BR>>If
r-o mode is disabled for a client, it won't connect to any server if there's no quorum.
  
   * '''Session handling:'''<<BR>>
-   * '''Session states. '''New session  state is introduced, CONNECTEDREADONLY. Client will
move to this state  when it's connected to a partitioned server (which automatically implies
 that only clients with r-o mode enabled can be in this state). If we're in this state we
can issue only read requests. From this state session could transition to CONNECTED -- if
client's reconnected to r/w server -- and to all states reachable from CONNECTED state.
+   * '''Session states.''' New session  state is introduced, CONNECTEDREADONLY. Client will
move to this state  when it's connected to a partitioned server (which automatically implies
 that only clients with r-o mode enabled can be in this state). If we're in this state we
can issue only read requests. From this state session could transition to CONNECTED -- if
client's reconnected to r/w server -- and to all states reachable from CONNECTED state.
-   * '''Session events. '''How will application know mode's changed? Default watcher of r-o
client will inform application about mode change from usual to read-only and vice versa, besides
current notifications like connection loss.
+   * '''Session events.''' How will application know mode's changed? Default watcher of r-o
client will inform application about mode change from usual to read-only and vice versa, besides
current notifications like connection loss.
-   *
-  * Make server distinguish these two types of clients: add "am-i-readonly-client" field
to a packet client sends to a server during connection handshake. If a server in r-o mode
receives connection request from not r-o client, it rejects the client.<<BR>>This
will involve changes in both Java and C clients.<<BR>>
+   * '''Special case of state transitions.''' If the very first server a client connects to
is a partitioned server, the client receives a "fake" session id from it (fake because the majority
doesn't know about this session). When such a client eventually connects to an r/w server, it receives
a valid session id. All this happens transparently to users. They should just be aware of
the fact that the sessionId stored in the ZooKeeper object could change (iff it is a fake sessionId; a valid
sessionId will never change).
+   * '''Watches set in r-o mode.''' A client can safely set watches in r-o mode, with the
obvious caveat that they will be fired only when the client reconnects to an r/w server. So, if a client
connects to a partitioned server, sets a watch for data changes of node /a, and meanwhile /a's
data is changed by the majority servers, the watch will be fired when the client reconnects to a majority
server.
  
-  * Currently, server still accepts connections even if it's partitioned, but drops them
quickly after they get connected.<<BR>><<BR>>This should be changed
in this way: when server loses a quorum, it doesn't drop read-only clients. It just informs
them about mode change (they will receive notification via their global watcher). After this
it should respond to read requests and throw exceptions to write requests.<<BR>>(For
normal not read-only clients, server will behave as it does now.)<<BR>><<BR>>Implementation
details:<<BR>>Create new subclass of !ZooKeeperServer, !ReadOnlyZooKeeperServer,
which has a pretty simple chain of request processors -- only one !ReadOnlyRequestProcessor,
which answers to read requests and throws exceptions to state-changing operations. When server,
namely !QuorumPeer, loses a quorum it destroys whichever !ZooKeeperServer was running and
goes to LOOKING state (this is a current logic which doesn't need to be altered), and creates
!ReadOnlyZooKeeperServer (new logic). Then, when some client connects to this peer, if running
server sees this is a read-only client, it starts handling its requests; if it's a usual client,
server drops the connection, as it does currently.<<BR>>When the peer reconnects
to the majority, it acts similarly to other state changes: shutdowns zk server (which will
cause notification of all read-only clients about state change), and switches to another state.
+ === Protocol changes ===
+ To let the server distinguish these two types of clients, an "am-i-readonly-client" field is added
to the packet a client sends to a server during the connection handshake. If a server in r-o mode
receives a connection request from a non-r-o client, it rejects the client. This is the only protocol
change, so the traffic overhead is one boolean per session.<<BR>>This will involve
changes in both the Java and C clients.
  
+ === Server-side ===
+ Server-side activity in r-o mode is handled by a subclass of !ZooKeeperServer, !ReadOnlyZooKeeperServer.
Its chain of request processors is similar to the leader's chain, but at the beginning it has a
!ReadOnlyRequestProcessor, which passes read operations through but throws exceptions on state-changing
operations.<<BR>><<BR>>When a server, namely !QuorumPeer, loses a quorum,
it destroys whichever !ZooKeeperServer was running and goes to the LOOKING state (this is the current
logic, which doesn't need to be altered), and creates a !ReadOnlyZooKeeperServer (new logic).
Then, when some client connects to this peer, if the running server sees this is a read-only client,
it starts handling its requests; if it's a usual client, the server drops the connection, as it
does currently.<<BR>><<BR>>When the peer reconnects to the majority,
it acts as it does on other state changes: it shuts down the ZK server (which causes all
read-only clients to be notified about the state change) and switches to another state.
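
As a rough illustration of the filtering step described for !ReadOnlyRequestProcessor, here is a minimal, self-contained Java sketch. It is not the actual ZooKeeper implementation: the class name, the `Op` enum, the choice of which operations count as state-changing, and the exception type are all hypothetical names invented for this example.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from the ZooKeeper code base.
final class ReadOnlyFilterSketch {
    // Simplified, assumed set of operation types.
    enum Op { CREATE, DELETE, SET_DATA, SET_ACL, EXISTS, GET_DATA, GET_CHILDREN, GET_ACL }

    // Operations assumed to change server state.
    private static final Set<Op> STATE_CHANGING =
            EnumSet.of(Op.CREATE, Op.DELETE, Op.SET_DATA, Op.SET_ACL);

    // Hypothetical exception reported to a client that attempts a write in r-o mode.
    static final class WriteRejectedException extends RuntimeException {
        WriteRejectedException(Op op) {
            super("read-only server rejects " + op);
        }
    }

    static boolean isStateChanging(Op op) {
        return STATE_CHANGING.contains(op);
    }

    // First stage of the chain: reads pass through to the next processor,
    // state-changing operations are rejected with an exception.
    static Op process(Op op) {
        if (isStateChanging(op)) {
            throw new WriteRejectedException(op);
        }
        return op;
    }
}
```

Under these assumptions, `ReadOnlyFilterSketch.process(Op.GET_DATA)` returns the operation unchanged, while `process(Op.SET_DATA)` throws.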
-  * Recovering from partitioning has two aspects:<<BR>>
-   * On the server side, when server regains a quorum, it should notify all currently connected
read-only clients about mode change; after this it can answer to write requests.
-   * On the client side, read-only client should keep hunting for a server-in-majority, and
if one found, it should reconnect to it, and fire global watcher that mode was changed.
  
- <<BR>>
+ === Recovering from partitioning ===
+  * '''Server side.''' When a server regains a quorum, all currently connected r-o clients
are notified about this (see above).
+  * '''Client side.''' A read-only client keeps searching for an r/w server: it goes through the
list of available servers and pings them with the newly added "isro" 4-letter command. If the answer
is "rw", which means the server is r/w, the client reconnects to it. This polling backs off exponentially,
i.e. the timeout between successive calls doubles after each call (with an upper bound of 60 seconds),
so that if there are no r/w servers available, clients don't overload the live ones with too frequent
and unnecessary requests.
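
The client-side polling schedule can be sketched as follows. This is only an illustration of the doubling-with-cap rule stated above: the class and method names are hypothetical, and the initial timeout is a caller-supplied assumption because the page does not specify one. Only the 60-second upper bound and the "rw" reply come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the client's search for an r/w server.
final class RwServerSearchBackoff {
    // Upper bound from the design text: 60 seconds.
    static final long MAX_TIMEOUT_MS = 60_000L;

    // The timeout doubles after each call, capped at the upper bound.
    static long nextTimeoutMs(long currentMs) {
        return Math.min(currentMs * 2, MAX_TIMEOUT_MS);
    }

    // Timeouts used for the first `attempts` polls, starting from an
    // assumed initial value (not specified in the design).
    static List<Long> schedule(long initialMs, int attempts) {
        List<Long> timeouts = new ArrayList<>();
        long current = Math.min(initialMs, MAX_TIMEOUT_MS);
        for (int i = 0; i < attempts; i++) {
            timeouts.add(current);
            current = nextTimeoutMs(current);
        }
        return timeouts;
    }

    // Interprets the reply to the "isro" command: "rw" means the server is r/w.
    static boolean isReadWrite(String reply) {
        return reply != null && "rw".equals(reply.trim());
    }
}
```

With an assumed initial timeout of 1 second, `schedule(1000L, 8)` yields 1000, 2000, 4000, 8000, 16000, 32000, 60000, 60000 milliseconds.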
  
- Application developers will decide which client to use, i.e. they'll choose whether to have
guaranteed consistent view of system, or agree to sometimes have outdated view in return for
read access.
+ === Backwards compatibility ===
+  * '''New server / old server:'''<<BR>>Server-server protocol remains untouched.
When a new server is in LOOKING state, it also runs !ReadOnlyZooKeeperServer, but this server
doesn't interact with other peers, just with r-o clients. New and old servers are safe to
run together.
+  * '''New client / old server:'''<<BR>>A new client sends the new "am-i-readonly-client"
field during the connection handshake, along with other information. An old server just ignores
this field as it doesn't know about it. From a new client's point of view an old server is just
a usual r/w server, as it never accepts a connection when it's partitioned. In conclusion: new
clients are safe to run against old servers, but obviously r-o mode will not be available
in that case, as it's not implemented in old servers.
+  * '''Old client / new server:'''<<BR>>A new server expects the "am-i-readonly-client"
info during the connection handshake, and an old client doesn't send it, so a new server just treats old
clients as r/w clients. This means that when a new server becomes partitioned, it rejects connections
from such clients. So, old clients will be correctly handled by new servers, without any
inconsistencies in session handling or other aspects.
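
The compatibility rules above can be illustrated with a small Java sketch. The wire layout assumed here (the read-only flag as a single optional trailing byte) is purely an assumption for the example, since the page does not describe the packet format; the class and method names are hypothetical as well.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of how a new server could treat the handshake flag.
final class HandshakeSketch {
    // `rest` holds whatever bytes follow the fields an old client already sends.
    // An old client sends nothing more, so it is treated as an r/w client.
    static boolean clientIsReadOnly(ByteBuffer rest) {
        return rest.hasRemaining() && rest.get() != 0;
    }

    // A server in r-o mode accepts only r-o clients;
    // a server with a quorum accepts every client.
    static boolean acceptConnection(boolean serverInReadOnlyMode, boolean clientReadOnly) {
        return !serverInReadOnlyMode || clientReadOnly;
    }
}
```

Under these assumptions, an old client (no trailing byte) is rejected by a partitioned new server and accepted by a new server that has a quorum.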
  
- Enabling read-only mode in current applications will involve changing session handling logic
(they will have to detect new "mode changed" notifications), but since read-only client is
a subclass of current client, interface remains unchanged and transition should be very transparent.
+ === Usage ===
+ Application developers will decide which client to use -- r-o enabled or not -- i.e. they'll
choose whether to have a guaranteed consistent view of the system, or agree to sometimes have an outdated
view in return for read access.<<BR>><<BR>>Enabling read-only mode
in current applications will involve changing session handling logic (they will have to detect
the new "mode changed" notifications), but since the interface remains unchanged the transition should
be very smooth.<<BR>><<BR>>It's worth repeating that, although the server side
will be (heavily?) changed, behavior for usual clients remains the same, so no
backwards incompatibility issues will be introduced.
  
- It's important to note, despite server side will be (heavily?) changed, behavior for usual
clients remains the same, so there will be no backwards incompatibility issues introduced.
+ === Random notes ===
+ Benefits of this design: transparent usage of a new client, backwards compatible.<<BR>>Drawbacks:
more coupling between server and client (but this seems unavoidable in any case).
  
+ === Related links ===
- <<BR>>
- 
- Benefits of this approach: transparent usage of a new client, backwards compatible.
- 
- Drawbacks: more coupling between server and client (but this seems unavoidable in any case).
- 
- <<BR>>
- 
- '''Related links'''
- 
  Jira ticket: [[https://issues.apache.org/jira/browse/ZOOKEEPER-704|ZOOKEEPER-704]]
  
  My GSoC [[http://docs.google.com/View?id=dghqvqdd_51ffvhcsdb|proposal]]
