cassandra-user mailing list archives

From ibrahim El-sanosi <ibrahimsaba...@gmail.com>
Subject Re: lightweight transactions with potential problem?
Date Thu, 27 Aug 2015 10:08:55 GMT
Hi Sylvain and all folks,

I have another scenario in mind where *linearizable consistency (CAS,
Compare-and-Set)* can fail, given that we have *the following round-trips:*

*1.*      *Prepare/promise*

*2.*      *Read/result*

*3.*      *Propose/accept*

*4.*      *Commit/acknowledgment*

Assume we have an application for registering new accounts, and I want to make
sure that exactly one user can claim a given account. For example,
we do not allow two users to have the same username.
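
As a rough illustration of the guarantee wanted here, the sketch below models a compare-and-set "claim" on a username in plain Python. It is a single-process toy, not Cassandra code: `AccountRegistry` and `claim` are hypothetical names, and the lock simply stands in for whatever mechanism serializes the two writers. In CQL the corresponding statement would be an `INSERT ... IF NOT EXISTS`.

```python
import threading


class AccountRegistry:
    """Toy in-memory stand-in for a table keyed by username (hypothetical)."""

    def __init__(self):
        self._owners = {}
        self._lock = threading.Lock()

    def claim(self, username, owner):
        """Write only if the username is currently unclaimed.

        Returns (applied, current_owner).
        """
        with self._lock:
            if username in self._owners:
                # Someone already holds this username: do not overwrite.
                return False, self._owners[username]
            self._owners[username] = owner
            return True, owner


registry = AccountRegistry()
first = registry.claim("V1", "C1")   # C1 tries to register V1
second = registry.claim("V1", "C2")  # C2 tries to register the same V1
```

Only the first claim is applied; the second one is refused and reports that C1 already owns V1.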

Assume we have a cluster consisting of 5 nodes: N1, N2, N3, N4, and N5. We
have two concurrent clients, C1 and C2. The replication factor is 3, and the
partitioner has determined that the primary and replica nodes for the INSERT
in this example are N3, N4, and N5.
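
The setup can be sketched as follows, assuming a simple placement where the replicas are the primary node plus the next nodes clockwise on the ring. That placement rule and the helper name `replicas_for` are assumptions for illustration only; real placement depends on token assignment and the replication strategy.

```python
def replicas_for(ring, primary, replication_factor):
    """Return the primary plus the next nodes clockwise on the ring."""
    if replication_factor > len(ring):
        raise ValueError("replication factor exceeds cluster size")
    start = ring.index(primary)
    # Wrap around the end of the ring with the modulo.
    return [ring[(start + i) % len(ring)] for i in range(replication_factor)]


ring = ["N1", "N2", "N3", "N4", "N5"]
replicas = replicas_for(ring, "N3", 3)   # primary N3, replication factor 3
quorum = len(replicas) // 2 + 1          # majority of the replica set
```

With N3 as the primary this gives N3, N4 and N5, and a majority of that replica set is 2 nodes.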



The scenario happens in the following order:

1.      C1 connects to coordinator N1 and sends INSERT V1 (assume V1 is a
username that has not been registered before).

2.      N1 sends a PREPARE message with ballot 1 (the highest ballot it has
seen) to N3, N4 and N5. Note that this prepare is for C1 and V1.

3.      Now C2 connects to coordinator N2 and sends INSERT V1.

4.      N2 sends a PREPARE message with ballot 2 to N3, N4 and N5. (Ballot 2
is the highest ballot after a re-prepare: the first time, N2 does not know
about ballot 1, but it eventually resolves this and uses ballot 2.) Note that
this prepare is for C2 and V1.

*5.*    *N1 sends a READ message to N3, N4 and N5 to read V1.*

*6.*    *N3, N4 and N5 send a RESULT message to N1, reporting that V1 does not
exist, so N1 goes forward to the next round.*

*7.*    *N2 sends a READ message to N3, N4 and N5 to read V1.*

*8.*    *N3, N4 and N5 send a RESULT message to N2, reporting that V1 does not
exist, so N2 goes forward to the next round.*

9.      Now N1 sends a PROPOSE message to N3, N4 and N5 (ballot 1, V1).

10.     N3, N4 and N5 send an ACCEPT message to N1.

11.     N2 sends a PROPOSE message to N3, N4 and N5 (ballot 2, V1).

12.     N3, N4 and N5 send an ACCEPT message to N2.

13.     N1 sends a COMMIT message to N3, N4 and N5 (ballot 1).

14.     N3, N4 and N5 send an ACK message to N1.

15.     N2 sends a COMMIT message to N3, N4 and N5 (ballot 2).

16.     N3, N4 and N5 send an ACK message to N2.



As a result, both V1 from client C1 and V1 from client C2 have been written to
replicas N3, N4, and N5, which I think does not achieve the goal of
*linearizable consistency and CAS.*



*Is that right, and could such a scenario occur?*
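
One way to check the scenario against textbook Paxos is a tiny model of the acceptor rule. The sketch below is only that: a minimal single-decree acceptor with a hypothetical `Acceptor` class, omitting the read round and everything Cassandra-specific. It assumes the standard rule that an acceptor which has promised a ballot refuses proposals carrying a lower one.

```python
class Acceptor:
    """Textbook single-decree Paxos acceptor (not Cassandra's actual code)."""

    def __init__(self):
        self.promised = 0      # highest ballot promised so far
        self.accepted = None   # (ballot, value) most recently accepted

    def prepare(self, ballot):
        """Promise the ballot unless an equal or higher one was promised."""
        if ballot <= self.promised:
            return False
        self.promised = ballot
        return True

    def propose(self, ballot, value):
        """Accept only if no higher ballot has been promised."""
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, value)
        return True


replicas = [Acceptor() for _ in range(3)]  # stand-ins for N3, N4, N5

step2 = [r.prepare(1) for r in replicas]                  # N1 prepares ballot 1
step4 = [r.prepare(2) for r in replicas]                  # N2 prepares ballot 2
step9 = [r.propose(1, "V1 from C1") for r in replicas]    # N1 proposes, ballot 1
step11 = [r.propose(2, "V1 from C2") for r in replicas]   # N2 proposes, ballot 2
```

In this model every replica refuses the ballot 1 proposal of step 9, because it has already promised ballot 2 in step 4, so steps 10, 13 and 14 would not happen as listed. Whether Cassandra's replicas apply the same rule is exactly the question above.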



I look forward to hearing from you.



Regards,


Ibrahim

On Wed, Aug 26, 2015 at 12:19 PM, ibrahim El-sanosi <
ibrahimsabattt@gmail.com> wrote:

> Thank you a lot
>
> Ibrahim
>
> On Wed, Aug 26, 2015 at 12:15 PM, Sylvain Lebresne <sylvain@datastax.com>
> wrote:
>
>> Yes
>>
>> On Wed, Aug 26, 2015 at 1:05 PM, ibrahim El-sanosi <
>> ibrahimsabattt@gmail.com> wrote:
>>
>>> OK. I see the purpose of the acknowledgment round now. So
>>> acknowledgment is optional here, depending on the CL setting, as is normal
>>> in Cassandra.
>>> So we can say that acknowledgment is not really related to the Paxos phase;
>>> it depends on the CL in Cassandra?
>>>
>>> Ibrahim
>>>
>>> On Wed, Aug 26, 2015 at 11:50 AM, Sylvain Lebresne <sylvain@datastax.com
>>> > wrote:
>>>
>>>> On Wed, Aug 26, 2015 at 12:19 PM, ibrahim El-sanosi <
>>>> ibrahimsabattt@gmail.com> wrote:
>>>>
>>>>> Yes, Sylvain, your answer makes more sense. This phase is sometimes
>>>>> called the learning or decide phase in the Paxos protocol, BUT it does
>>>>> not have an acknowledgment round, just a learning or decide message from
>>>>> the proposer to the learners. So why do we need an acknowledgment round
>>>>> with the commit phase in lightweight transactions?
>>>>>
>>>>
>>>> It's not _needed_ as far as Paxos is concerned. But it's useful in the
>>>> context of Cassandra. The commit phase is about actually persisting to
>>>> replicas the update decided by the Paxos algorithm and thus making that
>>>> update visible to non-Paxos reads. Being able to apply normal consistency
>>>> levels to this phase is thus useful, since it allows users to get
>>>> visibility guarantees even for non-Paxos reads if they so wish, and that's
>>>> exactly what we do and why we optionally wait on acknowledgments (and I say
>>>> optionally because how many acks we wait on depends on the user-provided
>>>> consistency level, and if that's CL.ANY then the whole Paxos operation
>>>> actually returns without waiting on any of those acks).
>>>>
>>>>
>>>>
>>>
>>
>
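
The quoted explanation of the commit round, where the number of acknowledgments waited on depends on the consistency level, can be sketched roughly as below. This is a simplified model with a hypothetical helper `commit_acks_required`; it covers only a few levels and ignores datacenter-aware ones.

```python
def commit_acks_required(consistency_level, replication_factor):
    """Number of commit ACKs the coordinator waits for (simplified model)."""
    levels = {
        "ANY": 0,                                # return without waiting on acks
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,   # majority of the replicas
        "ALL": replication_factor,
    }
    if consistency_level not in levels:
        raise ValueError(f"unsupported consistency level: {consistency_level}")
    return levels[consistency_level]


acks_any = commit_acks_required("ANY", 3)
acks_quorum = commit_acks_required("QUORUM", 3)
acks_all = commit_acks_required("ALL", 3)
```

With replication factor 3, this model waits for 0 acks at ANY, 2 at QUORUM and 3 at ALL.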
