Date: Tue, 22 Dec 2015 15:26:46 +0000 (UTC)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

    [ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068248#comment-15068248 ]

Jonathan Ellis commented on CASSANDRA-10726:
--------------------------------------------

Seeing reads "go backwards in time" is one of the most confusing aspects of
eventual consistency for people, so I do think it's important that quorum
reads avoid that, even more so because users tend to oversimplify quorum
reads as "strong
consistency that means I don't have to think about EC."  So to the degree
we can make that assumption true, we should, especially if that's been our
behavior already for 4+ years.

It seems like there are two primary problem scenarios:

* When a node is overloaded for writes, this stops reads as well.  First,
  delaying reads when we're behind on writes is arguably a good thing that
  will help you recover faster.  Second, the right way to tackle this is
  with better handling of the write overload as in CASSANDRA-9318.
* When data is read-only because disks are failing.  I agree with Sylvain
  that half-broken is often worse than completely broken, and in this
  specific case if a disk puts itself in read-only mode then it won't be
  long until it isn't readable either.  This is another case where "mark a
  disk bad and broadcast to other nodes not to send me requests for tokens
  pinned to it" as envisioned in CASSANDRA-6696 would be useful, along with
  an option for "promote write errors to blacklist on reads as well."

> Read repair inserts should not be blocking
> ------------------------------------------
>
>                 Key: CASSANDRA-10726
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Richard Low
>
> Today, if there’s a digest mismatch in a foreground read repair, the
> insert to update out of date replicas is blocking. This means, if it
> fails, the read fails with a timeout. If a node is dropping writes (maybe
> it is overloaded or the mutation stage is backed up for some other
> reason), all reads to a replica set could fail. Further, replicas dropping
> writes get more out of sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should
> not be blocking or we should return success for the read even if the write
> times out.
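
As an illustration of the second option in the description (return success
for the read even if the repair write times out), here is a minimal Java
sketch. It is not Cassandra's code: ReadRepairSketch,
readWithBestEffortRepair, the String standing in for a resolved row, and the
CompletableFuture standing in for the repair acknowledgements are all
hypothetical names chosen for this example. The concern raised in the comment
still applies: once the read no longer depends on the repair being
acknowledged, a later quorum read may see older data.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; none of these names come from the Cassandra code base.
final class ReadRepairSketch {

    /**
     * Returns the already-resolved read result. Waits up to timeoutMillis for
     * the repair writes to be acknowledged, but a slow or failed repair does
     * not turn into a failed read.
     */
    static String readWithBestEffortRepair(String resolvedRow,
                                           CompletableFuture<Void> repairAcks,
                                           long timeoutMillis) {
        try {
            // Still give the repair a chance to finish before answering.
            repairAcks.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Repair was not acknowledged in time, or it failed.
            // Assumption of this sketch: the read answers anyway.
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and answer with what we have.
            Thread.currentThread().interrupt();
        }
        return resolvedRow;
    }

    public static void main(String[] args) {
        // A repair that is never acknowledged: the read returns after ~50 ms.
        CompletableFuture<Void> neverAcked = new CompletableFuture<>();
        System.out.println(readWithBestEffortRepair("row-v2", neverAcked, 50));

        // A repair that is acknowledged immediately: the read returns at once.
        CompletableFuture<Void> acked = CompletableFuture.completedFuture(null);
        System.out.println(readWithBestEffortRepair("row-v2", acked, 50));
    }
}
```

The first option in the description (non-blocking repair writes) would
correspond to skipping the wait entirely, which this sketch approximates
with a timeout of 0.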