zookeeper-user mailing list archives

From Fansong Zeng <fanste...@gmail.com>
Subject Re: Question about the two-phrase commit
Date Tue, 06 Jan 2015 05:28:23 GMT
The change is persisted in step 2.
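
To make the ordering concrete, here is a minimal sketch of a follower that logs a proposal before acking it, so the transaction survives a crash that happens before the commit message arrives. This is a toy model, not ZooKeeper's code: the `Follower` class, its method names, and the in-memory list standing in for the on-disk transaction log are all assumptions made for illustration.

```python
class Follower:
    """Toy follower; names are hypothetical, not ZooKeeper classes."""

    def __init__(self):
        self.txn_log = []   # stands in for the on-disk transaction log
        self.state = {}     # stands in for the in-memory data tree

    def on_proposal(self, zxid, path, value):
        # Step 2: record the proposal in the log *before* acknowledging it.
        self.txn_log.append((zxid, path, value))
        return ("ACK", zxid)

    def on_commit(self, zxid):
        # Step 4: apply the already-logged transaction to in-memory state.
        for logged_zxid, path, value in self.txn_log:
            if logged_zxid == zxid:
                self.state[path] = value
                return
        raise KeyError(f"zxid {zxid} was never proposed")

    def recover(self):
        # After a crash the in-memory state is gone, but the log is not.
        self.state = {}
        return max((zxid for zxid, _, _ in self.txn_log), default=0)


follower = Follower()
ack = follower.on_proposal(1000, "/config", b"v2")
last_logged = follower.recover()  # crash before the COMMIT arrives
```

In this model the follower still knows about txid 1000 after the restart even though it never saw the commit, which is the point of persisting in step 2 rather than step 4.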

2015-01-05 18:55 GMT+08:00 Rakesh R <rakeshr@huawei.com>:

> Hi,
>
> In your case only A and E have committed the latest transaction; call it
> txid=1000. Servers B, C, and D are down at this time and do not have the
> changes of txid=1000.
> Also, when B, C, and D are restarted, servers A and E are not available. The
> newly elected Leader has seen at most txid=999, and when A and E rejoin the
> quorum they will 'truncate' themselves by deleting txid=1000. As you said,
> the write operation performed will be lost in this case.
>
> I can see this is a somewhat tricky case of double or multiple failures,
> but I agree it can happen.
> My point is that a user who wants to maintain a reliable cluster should
> keep in mind that more failures than the tolerated number may lead to
> unexpected results like this.
>
>
> Best Regards,
> Rakesh
> -----Original Message-----
> From: bit1129@163.com [mailto:bit1129@163.com]
> Sent: 05 January 2015 15:56
> To: user@zookeeper.apache.org
> Subject: Re: Question about the two-phrase commit
>
> Could someone help on this question? Thanks.
>
>
>
> bit1129@163.com
>
> From: bit1129@163.com
> Date: 2015-01-05 15:05
> To: user@zookeeper.apache.org
> Subject: Question about the two-phrase commit
>
> Hi, Zookeepers,
>
> I have a question about the two-phase commit in ZooKeeper. When a write
> operation happens:
>
> 1. The Leader proposes the change to all the followers (proposal/vote
>    phase).
> 2. Followers ack the proposal and write the change to disk (but not
>    persisted yet?).
> 3. When the Leader receives acks from a majority of the followers, it asks
>    the followers to commit the change.
> 4. When a follower receives the commit request, it commits the change
>    (persisting the change forever?).
>
> In the above process, something rare could happen:
>
> a. Say there are 5 nodes in the quorum (1 leader E, 4 followers A, B, C, D).
> b. The write operation is issued by a client connected to Follower A.
> c. A commits the change and responds to the client that the write
>    succeeded.
> d. Assume that by the time A's response reaches the client, the other
>    followers (B, C, D) have not yet received the commit request, and B, C, D
>    go down without getting a chance to commit the change.
>
>
> Then shut down A and E.
> Restart B, C, D, making sure that they elect a leader, and start A
> later (A's latest transactions will be lost, because A will sync with the
> Leader).
>
> When this is done, is the write operation performed earlier lost?
>
> Is there anything I missed in the above process? Thanks.
>
>
>
>
>
> bit1129@163.com
>
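
The "tolerated number of failures" mentioned in the thread follows from simple majority arithmetic. The sketch below works it out for the five-server ensemble in the question. The helper names `majority` and `tolerated_failures` are made up for this example and are not part of any ZooKeeper API.

```python
from itertools import combinations


def majority(ensemble_size):
    # Smallest number of servers that forms a quorum.
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    # Servers that may be down while a quorum can still form.
    return ensemble_size - majority(ensemble_size)


servers = ["A", "B", "C", "D", "E"]
quorum_size = majority(len(servers))
max_down = tolerated_failures(len(servers))

# Every possible majority of the ensemble, and whether each pair overlaps.
quorums = [set(q) for q in combinations(servers, quorum_size)]
all_overlap = all(q1 & q2 for q1 in quorums for q2 in quorums)

# In the scenario, B, C, and D are down at the same time.
down = {"B", "C", "D"}
exceeds_tolerance = len(down) > max_down
```

With five servers a quorum needs three members, so only two simultaneous failures are tolerated; losing B, C, and D together exceeds that. The overlap check also shows that any two majorities share at least one server, which is why a transaction logged by a majority remains visible to any later majority as long as the on-disk logs survive.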
