cassandra-commits mailing list archives

From "Sergio Bossa (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-5062) Support CAS
Date Wed, 27 Feb 2013 14:39:15 GMT


Sergio Bossa commented on CASSANDRA-5062:

Thanks for clarifying, [~slebresne].

{quote}What happens if the coordinator sends the commit to the replicas, but only a minority
of them receive it? Say 1 of 3 replicas gets it (and persists it), while the other two die
between the prepare and commit phases. Later on, those 2 replicas come back up while the 3rd
one dies, and we do a new CAS (which would have a majority and so should work).{quote}

The Zab deviation from standard 2PC here is that the coordinator doesn't need to wait for
acks from the replicas in the commit phase.
If a replica fails during the prepare phase, it will simply be out of the quorum.
If a replica fails after prepare but before completing the commit, it will recover later from
the leader: so in your example, when replicas 2 and 3 come back up, they will rejoin the
leader, which may hint the correct values to them.
If the third replica, which dies in your example, was actually the coordinator, a new
coordinator will be elected among the replicas that have seen either the last commit or the
latest *proposed* commit, which will then become committed.
So there's no lost-ack problem, as there's actually no ack at all in the commit phase: the
write will be "eventually" committed or recovered.
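
To make the scenario above concrete, here is a rough, in-memory sketch in Python. It is only an illustration of the idea, not Zab itself and not anything in Cassandra: the names `Replica`, `propose`, `recover` and the `(sequence, value)` proposal shape are all hypothetical, and it assumes a replica durably records a proposal before acking the prepare.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# Hypothetical proposal shape: (sequence number, value).
Proposal = Tuple[int, str]


@dataclass
class Replica:
    name: str
    up: bool = True
    proposed: Optional[Proposal] = None   # assumed to be persisted before acking
    committed: Optional[Proposal] = None

    def prepare(self, proposal: Proposal) -> bool:
        """Record the proposal and ack it; a dead replica never acks."""
        if not self.up:
            return False
        self.proposed = proposal
        return True

    def commit(self, proposal: Proposal) -> None:
        """Apply the commit; note that nothing is acked back."""
        if self.up:
            self.committed = proposal


def quorum(n: int) -> int:
    return n // 2 + 1


def propose(replicas: List[Replica], proposal: Proposal,
            commit_reaches: Set[str]) -> bool:
    """Prepare needs a quorum of acks; commit is sent without waiting for acks.

    `commit_reaches` simulates which replicas actually receive the commit.
    """
    acks = sum(1 for r in replicas if r.prepare(proposal))
    if acks < quorum(len(replicas)):
        return False
    for r in replicas:
        if r.name in commit_reaches:
            r.commit(proposal)
    return True


def recover(replicas: List[Replica]) -> Optional[Proposal]:
    """A live quorum adopts the latest commit or *proposed* commit it has seen."""
    live = [r for r in replicas if r.up]
    if len(live) < quorum(len(replicas)):
        raise RuntimeError("no quorum of live replicas")
    seen = [p for r in live for p in (r.proposed, r.committed) if p is not None]
    if not seen:
        return None
    latest = max(seen, key=lambda p: p[0])
    for r in live:
        r.proposed = latest
        r.committed = latest
    return latest


# The example from the quote: only r1 gets the commit, then r1 dies
# while r2 and r3 (which only saw the proposal) are back up.
replicas = [Replica("r1"), Replica("r2"), Replica("r3")]
propose(replicas, (1, "alice"), commit_reaches={"r1"})
replicas[0].up = False
result = recover(replicas)  # r2 and r3 promote the proposal they saw
```

In this toy model the new quorum (r2 and r3) ends up committing the same value r1 committed, because the proposal they acked during prepare is enough to recover from.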

By the way, I'm not saying this is definitely better than Paxos: I just *think* it is easier
and more practical (which, yes, doesn't mean it can be implemented easily on top of Cassandra).
> Support CAS
> -----------
>                 Key: CASSANDRA-5062
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
> "Strong" consistency is not enough to prevent race conditions.  The classic example is
> user account creation: we want to ensure usernames are unique, so we only want to signal account
> creation success if nobody else has created the account yet.  But naive read-then-write allows
> clients to race and both think they have a green light to create.
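
The race in the issue description can be shown with a small, single-process Python sketch. This is only an illustration under stated assumptions: `NaiveStore`, `CasStore`, `racy_read_then_write` and `create_if_absent` are hypothetical names, not Cassandra APIs, and the in-process lock merely stands in for whatever agreement protocol a real cluster would need.

```python
import threading
from typing import Dict, Optional, Tuple


class NaiveStore:
    """Plain key-value store with separate read and write calls."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def read(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def write(self, key: str, value: str) -> None:
        self._data[key] = value


def racy_read_then_write(store: NaiveStore, username: str) -> Tuple[bool, bool]:
    """Replay the bad interleaving: both clients read before either writes."""
    a_sees_free = store.read(username) is None
    b_sees_free = store.read(username) is None
    if a_sees_free:
        store.write(username, "client-a")
    if b_sees_free:
        store.write(username, "client-b")  # silently overwrites client-a
    # Each flag is what the corresponding client believes about its own success.
    return a_sees_free, b_sees_free


class CasStore:
    """Store whose check and write happen as one atomic step."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._lock = threading.Lock()  # stand-in for distributed agreement

    def create_if_absent(self, key: str, value: str) -> bool:
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def owner(self, key: str) -> Optional[str]:
        with self._lock:
            return self._data.get(key)


both_think_they_won = racy_read_then_write(NaiveStore(), "alice")

cas = CasStore()
first = cas.create_if_absent("alice", "client-a")
second = cas.create_if_absent("alice", "client-b")
```

With the naive store both clients believe they created the account; with the compare-and-set style call only the first one is told it succeeded.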

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators