Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Tue, 22 Dec 2015 15:26:46 +0000 (UTC)
From: "Jonathan Ellis (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12913901.1447799068000.93575.1450798006931@Atlassian.JIRA>
In-Reply-To: <JIRA.12913901.1447799068000@Atlassian.JIRA>
References: <JIRA.12913901.1447799068000@Atlassian.JIRA>
 <JIRA.12913901.1447799068205@arcas>
Subject: [jira] [Commented] (CASSANDRA-10726) Read repair inserts should not
 be blocking
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068248#comment-15068248 ]

Jonathan Ellis commented on CASSANDRA-10726:
--------------------------------------------

Seeing reads "go backwards in time" is one of the most confusing aspects of eventual consistency for people, so I do think it's important that quorum reads avoid that, even more so because users tend to oversimplify quorum reads as "strong consistency that means I don't have to think about EC." So to the degree we can make that assumption true, we should, especially if that's been our behavior already for 4+ years.

It seems like there are two primary problem scenarios:

* When a node is overloaded for writes, this stops reads as well. First, delaying reads when we're behind on writes is arguably a good thing that will help you recover faster. Second, the right way to tackle this is with better handling of the write overload as in CASSANDRA-9318.
* When data is read-only because disks are failing. I agree with Sylvain that half-broken is often worse than completely broken, and in this specific case if a disk puts itself in read-only mode then it won't be long until it isn't readable either. This is another case where "mark a disk bad and broadcast to other nodes not to send me requests for tokens pinned to it" as envisioned in CASSANDRA-6696 would be useful, along with an option for "promote write errors to blacklist on reads as well."

> Read repair inserts should not be blocking
> ------------------------------------------
>
>                 Key: CASSANDRA-10726
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Richard Low
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert to update out of date replicas is blocking. This means, if it fails, the read fails with a timeout. If a node is dropping writes (maybe it is overloaded or the mutation stage is backed up for some other reason), all reads to a replica set could fail. Further, replicas dropping writes get more out of sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not be blocking or we should return success for the read even if the write times out.
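
The two behaviours discussed in the description can be sketched roughly as follows. This is a minimal illustration only, not Cassandra's actual code: the class `ReadRepairSketch`, the methods `blockingRead` and `lenientRead`, and the use of `CompletableFuture` to stand in for repair-write acknowledgements are all assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: none of these names are Cassandra's real classes or methods.
public class ReadRepairSketch {

    /** Blocking variant: the read fails with a timeout if any repair write is not acknowledged in time. */
    static String blockingRead(String resolved,
                               List<CompletableFuture<Void>> repairAcks,
                               long timeoutMillis) throws TimeoutException {
        try {
            // Wait for every repair write to be acknowledged before answering the client.
            CompletableFuture.allOf(repairAcks.toArray(new CompletableFuture<?>[0]))
                    .get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return resolved;
    }

    /** Lenient variant: still waits up to the timeout, but returns the resolved value if the acks time out. */
    static String lenientRead(String resolved,
                              List<CompletableFuture<Void>> repairAcks,
                              long timeoutMillis) {
        try {
            return blockingRead(resolved, repairAcks, timeoutMillis);
        } catch (TimeoutException e) {
            // The repair write did not land in time; answer the read anyway.
            return resolved;
        }
    }

    public static void main(String[] args) {
        // A repair write that is never acknowledged, e.g. a replica dropping mutations.
        CompletableFuture<Void> dropped = new CompletableFuture<>();
        System.out.println(lenientRead("resolved-row", List.of(dropped), 50));
    }
}
```

Note the trade-off raised in the comment above: in the lenient variant a later quorum read may hit replicas that never received the repair, so a read could appear to go backwards in time.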


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)