cassandra-commits mailing list archives

From "Xiaolong Jiang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking
Date Mon, 13 Mar 2017 23:43:41 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923200#comment-15923200
] 

Xiaolong Jiang commented on CASSANDRA-10726:
--------------------------------------------

The patch does 2 things:
1. Before the patch: take a quorum read with RF = 3 (replica1, replica2, replica3). The
client request reads from 2 replicas (replica1, replica2). If there is a digest mismatch
between these 2 replicas, read repair kicks in. Say the stale data is on replica2: read
repair sends the correct data to replica2. If that write times out for some reason, we
return "read timeout" to the client.
After the patch: we wait for the replica2 write for some time. If it has not come back,
the correct data is also sent to replica3, whether or not replica3 already has the latest
data. If the replica3 write succeeds, 2 replicas are guaranteed to have the correct data,
so the read returns success with data to the client, and the next quorum read is
guaranteed to read the correct data.
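
A minimal sketch of the before/after behaviour in item 1, written as a small Python
model rather than Cassandra code. The function names, the wait_ms and deadline_ms
parameters, and the idea that the replica2 write may still be acknowledged after the
fallback is sent are all assumptions made for illustration; the patch itself may differ.

```python
from typing import Optional

SUCCESS = "success"
READ_TIMEOUT = "read timeout"


def _acked_by(ack_ms: Optional[float], sent_at_ms: float, cutoff_ms: float) -> bool:
    """True if a write sent at sent_at_ms is acknowledged by cutoff_ms.

    ack_ms is the (hypothetical) latency of the ack; None means it never arrives.
    """
    return ack_ms is not None and sent_at_ms + ack_ms <= cutoff_ms


def outcome_before(replica2_ack_ms: Optional[float], deadline_ms: float) -> str:
    # Before the patch: the read blocks on the repair write to the stale replica
    # only, so a dropped write surfaces to the client as a read timeout.
    return SUCCESS if _acked_by(replica2_ack_ms, 0, deadline_ms) else READ_TIMEOUT


def outcome_after(replica2_ack_ms: Optional[float],
                  replica3_ack_ms: Optional[float],
                  wait_ms: float,
                  deadline_ms: float) -> str:
    # Wait "for some time" (wait_ms, an assumed knob) for the replica2 write.
    if _acked_by(replica2_ack_ms, 0, wait_ms):
        return SUCCESS
    # Otherwise send the same data to replica3 at wait_ms. One ack from either
    # replica means 2 of 3 replicas hold the correct data (replica1 already did).
    if (_acked_by(replica2_ack_ms, 0, deadline_ms)
            or _acked_by(replica3_ack_ms, wait_ms, deadline_ms)):
        return SUCCESS
    return READ_TIMEOUT
```

For example, with replica2 dropping the write entirely, outcome_before(None, 1000)
gives "read timeout", while outcome_after(None, 50, 100, 1000) gives "success".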

2. The patch also makes sure that read repair does not block on more replicas than the
consistency level needs before replying, in the speculative retry and read repair chance
cases. Take the same RF = 3 quorum read as above: it reads from replica1 and replica2,
but replica2 is slow, so speculative retry kicks in and the read also goes to replica3.
All 3 replica reads then come back, but there is a digest mismatch: replica2 and replica3
both have stale data. Before the patch, read repair blocked until both replica2 and
replica3 had finished their repairs. There is no need to wait for both: one successful
repair is enough to guarantee a successful quorum read, so we only need to wait for one
repair to come back. The next quorum read is guaranteed to read the latest data even if
the replica3 read repair failed. The same applies to read repair chance. Say the read
repair chance is "GLOBAL": we don't need to block for all replicas to finish repair, we
only need to block for as many as the read consistency level needs.
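
The counting argument in item 2 can be sketched as simple arithmetic. This is an
illustration only: quorum() and repair_acks_needed() are hypothetical helper names, not
functions from the patch, and the model assumes every contacted replica that is not
stale already holds the latest data.

```python
from typing import Iterable


def quorum(rf: int) -> int:
    """Number of replicas a quorum read must hear from for replication factor rf."""
    return rf // 2 + 1


def repair_acks_needed(block_for: int,
                       contacted: Iterable[str],
                       stale: Iterable[str]) -> int:
    """How many repair writes must be acknowledged before replying to the client.

    block_for is the number of replicas the consistency level requires.
    Replicas that were contacted and are not stale already count towards it.
    """
    up_to_date = len(set(contacted) - set(stale))
    return max(0, block_for - up_to_date)


# The RF = 3 quorum example from item 2: speculative retry contacted all 3
# replicas, and replica2 and replica3 turned out to be stale.
needed = repair_acks_needed(
    quorum(3),
    ["replica1", "replica2", "replica3"],
    ["replica2", "replica3"],
)
```

Here needed is 1, whereas blocking on every stale replica, as before the patch, would
mean waiting for 2 acknowledgements.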

> Read repair inserts should not be blocking
> ------------------------------------------
>
>                 Key: CASSANDRA-10726
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Richard Low
>            Assignee: Xiaolong Jiang
>             Fix For: 3.0.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert to update
> out of date replicas is blocking. This means, if it fails, the read fails with a timeout.
> If a node is dropping writes (maybe it is overloaded or the mutation stage is backed up for
> some other reason), all reads to a replica set could fail. Further, replicas dropping writes
> get more out of sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not be blocking
> or we should return success for the read even if the write times out.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
