cassandra-user mailing list archives

From Ben Slater <>
Subject Re: Read Repairs and CL
Date Tue, 30 Aug 2016 11:04:01 GMT
Thanks Sam - a couple of subtleties there that we missed in our review.


On Tue, 30 Aug 2016 at 19:42 Sam Tunnicliffe <> wrote:

> Just to clarify a little further, it's true that read repair queries are
> performed at CL ALL, but this is slightly different to a regular,
> user-initiated query at that CL.
> Say you have RF=5 and you issue a read at CL ALL, the coordinator will send
> requests to all 5 replicas and block until it receives a response from each
> (or a timeout occurs) before replying to the client. This is the
> straightforward and intuitive case.
> If instead you read at CL QUORUM, the # of replicas required for CL is 3,
> so the coordinator only contacts 3 nodes. In the case where a speculative
> retry is activated, an additional replica is added to the initial set. The
> coordinator will still only wait for 3 out of the 4 responses before
> proceeding, but if a digest mismatch occurs the read repair queries are
> sent to all 4. It's this follow up query that the coordinator executes at
> CL ALL, i.e. it requires all 4 replicas to respond to the read repair query
> before merging their results to figure out the canonical, latest data.
> You can see that the number of replicas queried/required for read repair
> is different than if the client actually requests a read at CL ALL (i.e.
> here it's 4, not 5), it's the behaviour of waiting for all *contacted*
> replicas to respond which is significant here.
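The behaviour Sam describes can be sketched as a toy model. This is a hypothetical simplification for illustration only, not Cassandra's actual code; the function name and the tuple it returns are made up:

```python
def coordinator_read(rf, block_for, speculative=False):
    """Toy model of which replicas a coordinator contacts vs. waits for
    on a read that hits a digest mismatch.

    rf        -- replication factor
    block_for -- replicas required by the requested CL (e.g. 3 for QUORUM at RF=5)
    Returns (contacted, waited_for_initial, waited_for_repair) replica counts.
    """
    # Speculative retry adds one extra replica to the initial contact set,
    # capped at the replication factor.
    contacted = min(block_for + (1 if speculative else 0), rf)
    # The coordinator still only blocks on the CL's worth of responses...
    waited_initial = block_for
    # ...but the follow-up read repair query requires ALL contacted replicas
    # to respond (the "CL ALL" behaviour, scoped to contacted, not all, replicas).
    waited_repair = contacted
    return contacted, waited_initial, waited_repair

# RF=5, CL QUORUM with speculative retry: contacts 4, waits for 3,
# then needs all 4 for the read repair round.
print(coordinator_read(5, 3, speculative=True))
```

Note how the repair round waits on 4 replicas here, not the 5 that a client-requested CL ALL read would contact, matching the distinction Sam draws above.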
> There are additional considerations when constructing that initial replica
> set (which you can follow in
> o.a.c.service.AbstractReadExecutor::getReadExecutor), involving the table's
> read_repair_chance, dclocal_read_repair_chance and speculative_retry
> options. The main gotcha is global read repair (via read_repair_chance),
> which will trigger cross-DC repairs at CL ALL in the case of a digest
> mismatch, even if the requested CL is DC-local.
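The chance-based decision between global, DC-local, and no read repair can be sketched roughly as follows. This is a hedged approximation of the logic referenced above, not the actual Cassandra source; the function name and return strings are illustrative:

```python
import random

def read_repair_decision(read_repair_chance, dclocal_read_repair_chance):
    """Rough sketch of how the per-table chance options pick a repair scope.

    A single random draw is compared against the two chances: first the
    global chance, then the DC-local chance on top of it.
    """
    r = random.random()
    if r < read_repair_chance:
        return "GLOBAL"    # repairs span all DCs; digest mismatch => CL ALL read
    if r < read_repair_chance + dclocal_read_repair_chance:
        return "DC_LOCAL"  # repair confined to the coordinator's DC
    return "NONE"          # no chance-based read repair for this query
```

With read_repair_chance=0.1 and dclocal_read_repair_chance=0.0 (common defaults in that era), roughly 1 in 10 reads would take the GLOBAL path, which is the cross-DC gotcha Sam points out.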
> On Sun, Aug 28, 2016 at 11:55 AM, Ben Slater <>
> wrote:
>> In case anyone else is interested - we figured this out. When C* decides
>> it needs to do a repair based on a digest mismatch from the initial reads
>> for the consistency level, it does actually try to do a read at CL=ALL in
>> order to get the most up-to-date data to use for the repair.
>> This led to an interesting issue in our case where we had one node in an
>> RF=3 cluster down for maintenance (to correct data that became corrupted due
>> to a severe write overload) and started getting occasional “timeout during
>> read query at consistency LOCAL_QUORUM” failures. We believe this was due to
>> the case where data for a read was only available on one of the two up
>> replicas, which then triggered an attempted repair and a failed read at
>> CL=ALL. It seems that CASSANDRA-7947 (a while ago) changed the behaviour so
>> that C* reports a failure at the originally requested level even when it was
>> actually the attempted repair read at CL=ALL which could not read from
>> sufficient replicas - a bit confusing (although I can also see how getting
>> CL=ALL errors when you thought you were reading at QUORUM or ONE would be
>> confusing).
>> Cheers
>> Ben
>> On Sun, 28 Aug 2016 at 10:52 kurt Greaves <> wrote:
>>> Looking at the wiki for the read path (
>>>, in the bottom
>>> diagram for reading with a read repair, it states the following when
>>> "reading from all replica nodes" after there is a hash mismatch:
>>> If hashes do not match, do conflict resolution. First step is to read
>>>> all data from all replica nodes excluding the fastest replica (since CL=ALL)
>>>  In the bottom left of the diagram it also states:
>>>> In this example:
>>> RF>=2
>>> CL=ALL
>>> The "(since CL=ALL)" implies that the CL for the read during the read
>>> repair is based on the CL of the query. However, I don't think that makes
>>> sense at other CLs. Anyway, I just want to clarify what CL the read for the
>>> read repair occurs at in cases where the overall query CL is not ALL.
>>> Thanks,
>>> Kurt.
>>> --
>>> Kurt Greaves
>> --
>> ————————
>> Ben Slater
>> Chief Product Officer
>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>> +61 437 929 798
> --
Ben Slater
Chief Product Officer
Instaclustr: Cassandra + Spark - Managed | Consulting | Support
+61 437 929 798
