Date: Wed, 27 Feb 2013 14:51:14 +0000 (UTC)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-5062) Support CAS

[ https://issues.apache.org/jira/browse/CASSANDRA-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588400#comment-13588400 ]

Jonathan Ellis commented on CASSANDRA-5062:
-------------------------------------------

bq. probably the coordinator should hint something when he don't get the commit-ack from the 2 replicas that died

This is racy, though; if the coordinator also dies, then we still lose.

FWIW, Spinnaker's solution is actually pretty dicey here too: the leader does 2PC, and if the leader does not get a majority of acks back to its proposal, it will fail the op. But it doesn't actually abort or revert the proposal on the followers.
(And if it tried, it would still be open to a race, where it fails before aborting, leaving some proposals extant.) Then, when a new leader is elected, it replays the proposals it has not yet committed. So a proposal that originally failed, and was returned as such to the client, could succeed after failover.

I think Sergio's proposal has a similar problem: if the leader reports success to the client after local commit, but before it has been committed to the followers, we could either (1) lose the commit on failover if followers are pessimistic, or (2) commit data that we originally reported failed, as in Spinnaker, if we are optimistic. On the other hand, if the leader tries to wait for commit ack from followers before reporting to the client, it could block indefinitely during a partition, so that is no solution either.

> Support CAS
> -----------
>
>                 Key: CASSANDRA-5062
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5062
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>
> "Strong" consistency is not enough to prevent race conditions. The classic example is user account creation: we want to ensure usernames are unique, so we only want to signal account creation success if nobody else has created the account yet. But naive read-then-write allows clients to race and both think they have a green light to create.
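
The account-creation race from the issue description can be illustrated with a toy sketch. Everything here is hypothetical: `AccountStore`, `insert_if_absent`, and the client names are made up for illustration and are not Cassandra's API. The single-process lock merely stands in for whatever cross-replica protocol ends up providing atomicity, which is exactly the hard part debated in the comment above.

```python
import threading


class AccountStore:
    """Toy in-memory store (hypothetical; not Cassandra's API)."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def read(self, username):
        return self._rows.get(username)

    def write(self, username, owner):
        self._rows[username] = owner

    def insert_if_absent(self, username, owner):
        """Check and write as one atomic step (a CAS-style primitive)."""
        with self._lock:
            if username in self._rows:
                return False
            self._rows[username] = owner
            return True


def naive_race(store, username):
    """Two clients interleave read-then-write; both see the name as free."""
    a_sees_free = store.read(username) is None  # client A reads
    b_sees_free = store.read(username) is None  # client B reads before A writes
    if a_sees_free:
        store.write(username, "client-A")
    if b_sees_free:
        store.write(username, "client-B")  # silently overwrites A's row
    return a_sees_free, b_sees_free


def cas_race(store, username):
    """The same two clients using the atomic primitive; only one can win."""
    return (
        store.insert_if_absent(username, "client-A"),
        store.insert_if_absent(username, "client-B"),
    )


naive_result = naive_race(AccountStore(), "alice")
cas_result = cas_race(AccountStore(), "alice")
print(naive_result)  # (True, True): both clients think they created the account
print(cas_result)    # (True, False): only the first insert succeeds
```

In the naive version both clients get a green light and the second write clobbers the first; with the atomic check-and-write, the loser is told it lost.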