zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "bit1129@163.com" <bit1...@163.com>
Subject Re: Re: Question about the two-phrase commit
Date Tue, 06 Jan 2015 06:15:56 GMT
Thanks Alex for the detailed explanation, I understand  much better.

Following the your explanation, another question hits me. Say, that, only one follower A persist
the write to the disk and acks the proposal to the leader,but others don't. 

The all the quorum are restarted. There are two things may happen

1. B,C,D starts first, assume D is the leader.
Then A's last write will be lost because A will sync with D, I think this is OK because, the
client of A didn't get the response that A's write is successful.

2. A,B,C starts first
Assume A is the leader since it has more recent transaction id, then the whole quorum will
have this write because B,C  will sync with A. At last, the whole quorum will have the write.
Is this the expected behavior? 
I don't think so because 1 and 2 are conflicting. In 1, A's write is inaccessible,but in 2,
A's write is accessible.

Is there something that I miss? Thanks.

From: Alexander Shraer
Date: 2015-01-06 13:26
To: user@zookeeper.apache.org
Subject: Re: Question about the two-phrase commit
A few things are not accurate. First, ZooKeeper implements consensus on
each operation, not 2 phase commit.
There are differences in the definition and guaratees of 2PC and Consensus.
> 2. Followers ack the proposal and writes the change to the disk(but not
persisted yet?)
Before acking a follower writes/persists the proposed operation to disk (an
operations log).
> 4. When each follower receives the commit request, follower commits the
changes(persist the change for ever?)
The commit operaiton does not trigger a write to disk. What it does is that
now the state change is applied to an in memory data structure holding
ZooKeeper state. Since reads are served from that in-memory data structure,
the write is now visible to reads.
> d. Assume that When the response from A is  back to client telling the
client that the write is successful, But in the period, the
> other followers (B,C,D) haven't even received the commit request, and
B,C,D are down without getting a chance to commit the
> change.
Whether an operation was "committed" or not is not important during
recovery. What's important is that a quorum acked it. So when you restart
B, C, D, one of them necessarily has this write in its log. The others may
have it or not but in any case, who ever is elected leader will have this
operation in the log. When a leader is established the log is applied to
memory (there are also snapshots which allow truncating the log, so the
snapshot is applied first and then the log). When A syncs with the new
leader, the only thing that it can loose are operations that were not acked
by a quorum previously, and hence was not committed.
On Sun, Jan 4, 2015 at 11:05 PM, bit1129@163.com <bit1129@163.com> wrote:
> In the above process, something rare could happen
> a. Say,there are 5 nodes in the quorum(1 leader E, 4 follower A,B,C,D).
> b. The write operation is issued by the client that connects to Follower A
> c. A commits the changes and response to the client that the writer
> succeeds.
> d. Assume that When the response from A is  back to client telling the
> client that the write is successful, But in the period, the other followers
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message