http://zookeeper.apache.org/doc/r3.2.2/zookeeperInternals.html#sc_guaranteesPropertiesDefinitions
describes the ordering properties,
but they are still not immediately obvious.
I have been thinking about this, and I find that the following assertions may
help make them clearer.
Each node sees a sequence of messages, including proposals (Pi) and commits (Ci).
We use "n{X < Y}" to mean that message X appears before message Y in the
sequence seen by node n.
0) For any messages X, Y and nodes n, m: if n{X < Y} and Y
appears in m, then m{X < Y}. This is because the leader sends
out messages in the same order to every node, and a TCP connection
preserves order. In other words, the shorter sequence on node n always
forms a prefix of the longer sequence on node m.
1) For any proposal number i and node n, n{Pi < Ci}, because
you can only commit a proposal after you have seen it.
2) For any proposal numbers i < j and node n, n{Pi < Pj} and
n{Ci < Cj}, because:
a) leader{Pi < Pj}: the leader always proposes in order
b) follower{Pi < Pj}: due to 0)
c) leader{Ci < Cj}: ZooKeeper followers handle
proposals in order, and the connections are FIFO
d) follower{Ci < Cj}: due to 0)
3) For any proposal numbers i < j and node n, either n{Ci < Pj} or
n{Pj < Ci} is possible, because the leader can have multiple proposals in flight.
For example, P1 P2 P3 arrive at the follower, and the follower
sends back C1 C2 C3;
when the leader receives the commits, any merge of these two
sequences that still satisfies 1) is valid.
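
To make the assertions above concrete, here is a minimal sketch in Python. It is only a model of the notation, not ZooKeeper code: messages are plain strings such as "P1" and "C1", and the helper names before, is_prefix and interleavings are made up for this illustration. It enumerates every merge of P1..Pk and C1..Ck allowed by 1) and 2), and checks that both orders in 3) really do occur.

```python
def before(seq, x, y):
    """n{x < y}: x appears before y in the sequence seen by one node."""
    return x in seq and y in seq and seq.index(x) < seq.index(y)


def is_prefix(short, long):
    """Assertion 0): the shorter sequence is a prefix of the longer one."""
    return list(long[:len(short)]) == list(short)


def interleavings(k):
    """All merges of P1..Pk and C1..Ck that keep proposals in order,
    keep commits in order (assertion 2) and put Pi before Ci (assertion 1)."""
    results = []

    def extend(seq, p, c):
        # p = number of proposals placed so far, c = number of commits placed
        if p == k and c == k:
            results.append(tuple(seq))
            return
        if p < k:
            extend(seq + [f"P{p + 1}"], p + 1, c)
        if c < p:  # C(c+1) may only appear once P(c+1) has been seen
            extend(seq + [f"C{c + 1}"], p, c + 1)

    extend([], 0, 0)
    return results


if __name__ == "__main__":
    seqs = interleavings(3)
    for s in seqs:
        print(" ".join(s))
    # Assertion 3): for i < j, both Ci < Pj and Pj < Ci occur in some valid sequence.
    for i, j in [(1, 2), (1, 3), (2, 3)]:
        assert any(before(s, f"C{i}", f"P{j}") for s in seqs)
        assert any(before(s, f"P{j}", f"C{i}") for s in seqs)
```

For k = 3 this enumerates five sequences, including P1 P2 P3 C1 C2 C3 and P1 C1 P2 C2 P3 C3. Under 0), two nodes would each see a prefix of the same one of these sequences.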
