cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking
Date Tue, 22 Dec 2015 15:26:46 GMT


Jonathan Ellis commented on CASSANDRA-10726:

Seeing reads "go backwards in time" is one of the most confusing aspects of eventual consistency
for people, so I do think it's important that quorum reads avoid that, even more so because
users tend to oversimplify quorum reads as "strong consistency that means I don't have to
think about EC."  So to the degree we can make that assumption true, we should, especially
if that's been our behavior already for 4+ years.

It seems like there are two primary problem scenarios:

* When a node is overloaded for writes, this stops reads as well.  First, delaying reads when
we're behind on writes is arguably a good thing that will help you recover faster.  Second,
the right way to tackle this is with better handling of the write overload as in CASSANDRA-9318.
* When data is read-only because disks are failing.  I agree with Sylvain that half-broken
is often worse than completely broken, and in this specific case if a disk puts itself in
read-only mode then it won't be long until it isn't readable either.  This is another case
where "mark a disk bad and broadcast to other nodes not to send me requests for tokens pinned
to it" as envisioned in CASSANDRA-6696 would be useful, along with an option for "promote
write errors to blacklist on reads as well."

> Read repair inserts should not be blocking
> ------------------------------------------
>                 Key: CASSANDRA-10726
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Richard Low
> Today, if there’s a digest mismatch in a foreground read repair, the insert to update
out of date replicas is blocking. This means, if it fails, the read fails with a timeout.
If a node is dropping writes (maybe it is overloaded or the mutation stage is backed up for
some other reason), all reads to a replica set could fail. Further, replicas dropping writes
get more out of sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not be blocking
or we should return success for the read even if the write times out.
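
For illustration only, the second option in the description (return success for the read even if the repair write times out) could look roughly like the sketch below. It is not taken from the Cassandra code base: `ReadRepairSketch`, `RepairOutcome`, `awaitRepair` and `readWithRepair` are hypothetical names, and the repair acknowledgements are modelled as a plain `CompletableFuture`. The read still waits a bounded time for the repair writes, but a timeout or failure is reported as an outcome rather than thrown to the client.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; none of these names exist in Cassandra.
public class ReadRepairSketch {

    /** What happened to the repair writes while the read was waiting. */
    public enum RepairOutcome { ACKED, TIMED_OUT, FAILED }

    /**
     * Waits up to timeoutMillis for the repair writes to be acknowledged.
     * A slow or failed repair is reported to the caller instead of being
     * rethrown, so it cannot turn into a read timeout by itself.
     */
    public static RepairOutcome awaitRepair(CompletableFuture<Void> repairAcks,
                                            long timeoutMillis) throws InterruptedException {
        try {
            repairAcks.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return RepairOutcome.ACKED;
        } catch (TimeoutException e) {
            return RepairOutcome.TIMED_OUT;   // replica is behind or dropping writes
        } catch (ExecutionException e) {
            return RepairOutcome.FAILED;      // repair write failed outright
        }
    }

    /**
     * Returns the already-resolved value whatever the repair outcome is.
     * A real coordinator would at least log or count non-ACKED outcomes.
     */
    public static String readWithRepair(String resolvedValue,
                                        CompletableFuture<Void> repairAcks,
                                        long timeoutMillis) throws InterruptedException {
        awaitRepair(repairAcks, timeoutMillis);
        return resolvedValue;
    }

    public static void main(String[] args) throws InterruptedException {
        // A future that is never completed stands in for a replica dropping writes.
        CompletableFuture<Void> neverAcked = new CompletableFuture<>();
        System.out.println(awaitRepair(neverAcked, 50));
        System.out.println(readWithRepair("resolved-row", neverAcked, 50));
    }
}
```

The cost of this design is the one raised in the comment above: if the read returns before the repair writes are acknowledged, a later quorum read may see older data, so reads can again appear to "go backwards in time."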
