Date: Fri, 7 Apr 2017 10:26:41 +0000 (UTC)
From: "Stefan Podkowinski (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Updated] (CASSANDRA-12126) CAS Reads Inconsistencies

     [ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Podkowinski updated CASSANDRA-12126:
-------------------------------------------
    Status: Patch Available  (was: In Progress)

> CAS Reads Inconsistencies
> --------------------------
>
>                 Key: CASSANDRA-12126
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>            Reporter: sankalp kohli
>            Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with CAS Reads. Here is how it can happen with RF=3:
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies true to a propose and saves the commit in the accepted field. The other two machines, B and C, do not get to the accept phase.
> Current state is that machine A has this commit in the paxos table as accepted but not committed, and B and C do not.
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the value written in step 1.
> This step is as if nothing is in flight.
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that there is something in flight from A and will propose and commit it with the current ballot. Now we can read the value written in step 1 as part of this CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the value written in step 1.
> 4) Issue a CAS Write and it involves only B and C. This will succeed and commit a different value than step 1. The step 1 value will never be seen again and was never seen before.
> If you read section 2.3 of Lamport's “Paxos Made Simple” paper, it talks about this issue, which is how learners can find out whether a majority of the acceptors have accepted the proposal.
> In step 3, it is correct that we propose the value again, since we don't know if it was accepted by a majority of acceptors. When we ask a majority of acceptors, and more than one acceptor but not a majority has something in flight, we have no way of knowing if it is accepted by a majority of acceptors. So this behavior is correct.
> However, we need to fix step 2, since it causes reads to not be linearizable with respect to writes and other reads. In this case, we know that a majority of acceptors have no in-flight commit, which means a majority confirms that nothing was accepted by a majority. I think we should run a propose step here with an empty commit, which will cause the write from step 1 to never be visible afterwards.
> With this fix, we will either see the data written in step 1 on the next serial read or will never see it, which is what we want.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
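
The scenario in the quoted description (steps 1 to 3, with and without the suggested fix) can be sketched as a toy model. This is only an illustration under stated assumptions, not Cassandra's actual Paxos code: the `Replica` class, the `serial_read` and `run` functions, the `EMPTY` marker, and the ballot numbers are all hypothetical names invented for this sketch, and a replica's state is reduced to one accepted proposal and one committed value.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

EMPTY = "<empty>"  # hypothetical marker standing in for an empty commit


@dataclass
class Replica:
    name: str
    # (ballot, value) that was accepted but not yet committed, if any
    accepted: Optional[Tuple[int, str]] = None
    committed: Optional[str] = None


def serial_read(quorum: List[Replica], ballot: int, propose_empty: bool) -> Optional[str]:
    """Toy CAS read: finish in-flight work seen in the quorum, then read."""
    in_flight = [r.accepted for r in quorum if r.accepted is not None]
    if in_flight:
        # Replay the in-flight proposal that carries the highest ballot.
        _, value = max(in_flight, key=lambda p: p[0])
        for r in quorum:
            if value != EMPTY:
                r.committed = value
            r.accepted = None
    elif propose_empty:
        # Suggested fix: the quorum accepts an empty proposal under the
        # read's ballot, so an older minority proposal loses later on.
        for r in quorum:
            r.accepted = (ballot, EMPTY)
    # Simplification: every replica in the quorum agrees at this point.
    return quorum[0].committed


def run(propose_empty: bool) -> List[Optional[str]]:
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    a.accepted = (1, "v1")  # step 1: only A accepted the failed write
    first = serial_read([b, c], ballot=2, propose_empty=propose_empty)   # step 2
    second = serial_read([a, b], ballot=3, propose_empty=propose_empty)  # step 3
    return [first, second]


print(run(propose_empty=False))  # behaviour described in the report
print(run(propose_empty=True))   # behaviour with the suggested fix
```

Without the empty proposal, the first read misses the step 1 value and the second read surfaces it. With the empty proposal, the newer empty ballot on B outranks A's older accepted value, so the step 1 write is never replayed in this model.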