cassandra-commits mailing list archives

From "Jeremiah Jordan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6866) Read repair path of quorum reads makes cluster to timeout all requests under load
Date Fri, 28 Aug 2015 16:58:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720229#comment-14720229
] 

Jeremiah Jordan commented on CASSANDRA-6866:
--------------------------------------------

Global read repair is no longer enabled by default.

> Read repair path of quorum reads makes cluster to timeout all requests under load
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6866
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6866
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Oleg Anastasyev
>         Attachments: ReadRepairPathFixExample.txt, Read_Latency__2data___digest_vs_3_data__-_99_.png
>
>
> The current implementation of the read repair path for quorum reads is:
> 1. request data from 1 or 2 endpoints; request digests from the others.
> 2. compare digests; throw DigestMismatchEx on mismatch
> 3. request data from all contacted replicas with CL.ALL
> 4. prepare read repairs; send mutations
> 5. wait for all mutations to ack
> 6. retry the read and prepare the result.
> The main problem is in step 3 (although step 5 is not good either): any of the
> endpoints can go down without yet being known to be down while this path executes.
> So, if a noticeable amount of read repair is happening (shortly after a rack of nodes
> has started up, for example), waiting on CL.ALL and on acks of RR mutations from
> not-yet-known-to-be-down endpoints quickly occupies all client thread pools on all
> nodes, so the cluster becomes unavailable.
> This also makes (otherwise successful) reads time out from time to time even under
> light load on the cluster, just because of temporary network hiccups or GC on a
> single endpoint.
> I do not have a generic solution for this; I fixed it in a way that is appropriate
> for us: using the always speculative retry policy, patched to make data requests only
> (no digests) and to do read repair on the data at once (without requesting it again).
> This way, yet-not-known-to-be-down endpoints simply do not respond to data requests,
> so the further read repair path does not contact them at all.
> I attached my patch here for illustration.
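
The difference between step 3 of the current path and the workaround described above can be sketched with a toy model. This is only an illustration under stated assumptions, not the attached patch and not Cassandra code: `Versioned`, `Replica`, `reread_at_cl_all` and `data_only_read` are hypothetical names, requests are modeled as synchronous calls, and a replica that is down but not yet known to be down is modeled as one whose `read()` returns `None` (standing in for a request that would time out).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Versioned:
    """A value plus the write timestamp used to pick the newest copy."""
    value: str
    timestamp: int


@dataclass
class Replica:
    """Hypothetical stand-in for an endpoint; not a Cassandra class."""
    name: str
    stored: Versioned
    responsive: bool = True  # False = down, but not yet known to be down

    def read(self) -> Optional[Versioned]:
        # None stands in for a data request that would time out.
        return self.stored if self.responsive else None

    def repair(self, newest: Versioned) -> None:
        # Apply a read-repair mutation only if it is newer than what we hold.
        if newest.timestamp > self.stored.timestamp:
            self.stored = newest


def reread_at_cl_all(contacted: List[Replica]) -> List[Versioned]:
    """Model of step 3: every contacted replica must answer, or the read fails."""
    results = []
    for replica in contacted:
        answer = replica.read()
        if answer is None:
            raise TimeoutError(f"no data from {replica.name}")
        results.append(answer)
    return results


def data_only_read(replicas: List[Replica], quorum: int) -> Tuple[str, List[str]]:
    """Model of the workaround: data requests only, repair only responders."""
    answered = [(r, v) for r, v in ((r, r.read()) for r in replicas) if v is not None]
    if len(answered) < quorum:
        raise TimeoutError("quorum not reached")
    newest = max((v for _, v in answered), key=lambda v: v.timestamp)
    repaired = []
    for replica, seen in answered:
        if seen.timestamp < newest.timestamp:
            replica.repair(newest)  # silent replicas are never contacted here
            repaired.append(replica.name)
    return newest.value, repaired


if __name__ == "__main__":
    nodes = [
        Replica("a", Versioned("new", 2)),
        Replica("b", Versioned("old", 1)),
        Replica("c", Versioned("old", 1), responsive=False),
    ]
    print(data_only_read(nodes, quorum=2))
```

In this model a single silent replica is enough to fail `reread_at_cl_all`, while `data_only_read` still returns the newest value and repairs only the stale replica that actually answered.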



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
