cassandra-user mailing list archives

From Ramesh Natarajan <rames...@gmail.com>
Subject Re: Consistency level and ReadRepair
Date Wed, 05 Oct 2011 19:36:52 GMT
Thanks for the explanation. I think I am at a loss trying to understand
the tpstats output. When does the ReadRepair count get incremented?

- When any read is performed with CL < ALL and RF=3, or
- When there is a discrepancy?

I took 2 tpstats snapshots, and the counts indicate there
were 1042805 reads and 354774 ReadRepairs between them.
All reads are done with consistency level QUORUM. Per the documentation,
should read repair be performed on every read?

ReadStage                         1         1        3533450         0                 0
RequestResponseStage              0         0        7258586         0                 0
MutationStage                     0         1        5056119         0                 0
ReadRepairStage                   0         0        1210754         0                 0


ReadStage                         1         1        4576255         0                 0
RequestResponseStage              0         0        9460969         0                 0
MutationStage                     0         2        6638499         0                 0
ReadRepairStage                   0         0        1565528         0                 0


Read difference: 1042805
ReadRepair difference : 354774
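
The two differences can be reproduced with a short script. This is only a sketch: it hard-codes the completed-task counts from the two tpstats snapshots above, and the names `first`, `second`, and `deltas` are illustrative, not part of nodetool.

```python
# Completed-task counts copied from the two tpstats snapshots above.
first = {
    "ReadStage": 3533450,
    "RequestResponseStage": 7258586,
    "MutationStage": 5056119,
    "ReadRepairStage": 1210754,
}
second = {
    "ReadStage": 4576255,
    "RequestResponseStage": 9460969,
    "MutationStage": 6638499,
    "ReadRepairStage": 1565528,
}


def deltas(before, after):
    """Return the per-stage increase in completed tasks between two snapshots."""
    return {stage: after[stage] - before[stage] for stage in before}


d = deltas(first, second)
ratio = d["ReadRepairStage"] / d["ReadStage"]

print("Read difference:", d["ReadStage"])
print("ReadRepair difference:", d["ReadRepairStage"])
print("ReadRepair tasks per read task: %.3f" % ratio)
```

On this node that works out to roughly one ReadRepairStage task for every three ReadStage tasks in the interval.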

thanks
Ramesh

On Wed, Oct 5, 2011 at 2:21 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> As explained in the link in my earlier reply, "Read Repair" just means
> "a replica was checked in the background," not that it was out of
> sync.
>
> On Wed, Oct 5, 2011 at 2:00 PM, Ramesh Natarajan <ramesh25@gmail.com> wrote:
>> Let's assume we have 3 nodes all up and running at all times with no
>> failures or communication problems.
>> 1. If I have RF=3 and write with QUORUM, the change gets committed on
>> 2 nodes. What delay should we expect before the 3rd replica
>> gets written?
>> 2. In this scenario (no failures, etc.), if we do a QUORUM read,
>> what situation can lead to read repair? I didn't expect
>> any ReadRepair because all 3 must have the same value.
>>
>>
>> On Wed, Oct 5, 2011 at 1:11 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>> Start with http://wiki.apache.org/cassandra/ReadRepair.  Read repair
>>> count increasing just means you were doing reads at < CL.ALL, and had
>>> the CF configured to perform RR.
>>>
>>> On Wed, Oct 5, 2011 at 12:37 PM, Ramesh Natarajan <ramesh25@gmail.com> wrote:
>>>> I have a 12-node Cassandra cluster running with RF=3. I have several
>>>> clients (all running on a single node) connecting to the cluster with a
>>>> fixed client-to-node mapping; they perform inserts, updates, selects,
>>>> and deletes. Each client has a fixed mapping of row keys and always
>>>> connects to the same node. The timestamp on the client node is used for
>>>> all operations. All operations are done using CL QUORUM.
>>>>
>>>> When I run tpstats I see the ReadRepair count consistently
>>>> increasing. I need to figure out why ReadRepair is happening.
>>>>
>>>> One scenario I can think of is, it could happen when there is a delay
>>>> in updating the nodes to reach eventual consistency..
>>>>
>>>> Let's say I have 3 nodes (RF=3)  A,B,C. I insert  <key> with timestamp
>>>> <ts1> to A and the call will return as soon as it inserts the record
>>>> to A and B. At some later point this information is sent to C...
>>>>
>>>> A while later A,B,C have the same data with the same timestamp.
>>>>
>>>> A <key,ts1>
>>>> B <key, ts1> and
>>>> C <key, ts1>
>>>>
>>>> When I update <key> with timestamp <ts2> on A, the call will
>>>> return as soon as it inserts the record to A and B.
>>>> Now the data is
>>>>
>>>> A <key,ts2>
>>>> B <key,ts2>
>>>> C <key,ts1>
>>>>
>>>> Assuming I query for <key> and A and C respond: since they do not
>>>> agree, there is no QUORUM, so it waits for B to respond. When A and B
>>>> match, the response is returned to the client and a ReadRepair is
>>>> sent to C.
>>>>
>>>> This could happen only when C is running behind in catching up with the
>>>> updates to A and B. Are there any stats that would let me know if the
>>>> system is in a consistent state?
>>>>
>>>> thanks
>>>> Ramesh
>>>>
>>>>
>>>> tpstats_2011-10-05_12:50:01:ReadRepairStage                   0         0       43569781         0                 0
>>>> tpstats_2011-10-05_12:55:01:ReadRepairStage                   0         0       43646420         0                 0
>>>> tpstats_2011-10-05_13:00:02:ReadRepairStage                   0         0       43725850         0                 0
>>>> tpstats_2011-10-05_13:05:01:ReadRepairStage                   0         0       43790047         0                 0
>>>> tpstats_2011-10-05_13:10:02:ReadRepairStage                   0         0       43869704         0                 0
>>>> tpstats_2011-10-05_13:15:01:ReadRepairStage                   0         0       43945635         0                 0
>>>> tpstats_2011-10-05_13:20:01:ReadRepairStage                   0         0       44020406         0                 0
>>>> tpstats_2011-10-05_13:25:02:ReadRepairStage                   0         0       44093227         0                 0
>>>> tpstats_2011-10-05_13:30:01:ReadRepairStage                   0         0       44167455         0                 0
>>>> tpstats_2011-10-05_13:35:02:ReadRepairStage                   0         0       44247519         0                 0
>>>> tpstats_2011-10-05_13:40:01:ReadRepairStage                   0         0       44312726         0                 0
>>>> tpstats_2011-10-05_13:45:01:ReadRepairStage                   0         0       44387633         0                 0
>>>> tpstats_2011-10-05_13:50:01:ReadRepairStage                   0         0       44443683         0                 0
>>>> tpstats_2011-10-05_13:55:02:ReadRepairStage                   0         0       44499487         0                 0
>>>> tpstats_2011-10-05_14:00:01:ReadRepairStage                   0         0       44578656         0                 0
>>>> tpstats_2011-10-05_14:05:01:ReadRepairStage                   0         0       44647555         0                 0
>>>> tpstats_2011-10-05_14:10:02:ReadRepairStage                   0         0       44716730         0                 0
>>>> tpstats_2011-10-05_14:15:01:ReadRepairStage                   0         0       44776644         0                 0
>>>> tpstats_2011-10-05_14:20:01:ReadRepairStage                   0         0       44840237         0                 0
>>>> tpstats_2011-10-05_14:25:01:ReadRepairStage                   0         0       44891444         0                 0
>>>> tpstats_2011-10-05_14:30:01:ReadRepairStage                   0         0       44931105         0                 0
>>>> tpstats_2011-10-05_14:35:02:ReadRepairStage                   0         0       44976801         0                 0
>>>> tpstats_2011-10-05_14:40:01:ReadRepairStage                   0         0       45042220         0                 0
>>>> tpstats_2011-10-05_14:45:01:ReadRepairStage                   0         0       45112141         0                 0
>>>> tpstats_2011-10-05_14:50:02:ReadRepairStage                   0         0       45177816         0                 0
>>>> tpstats_2011-10-05_14:55:02:ReadRepairStage                   0         0       45246675         0                 0
>>>> tpstats_2011-10-05_15:00:01:ReadRepairStage                   0         0       45309533         0                 0
>>>> tpstats_2011-10-05_15:05:01:ReadRepairStage                   0         0       45357575         0                 0
>>>> tpstats_2011-10-05_15:10:01:ReadRepairStage                   0         0       45405943         0                 0
>>>> tpstats_2011-10-05_15:15:01:ReadRepairStage                   0         0       45458435         0                 0
>>>> tpstats_2011-10-05_15:20:01:ReadRepairStage                   0         2       45508253         0                 0
>>>> tpstats_2011-10-05_15:25:01:ReadRepairStage                   0         0       45570375         0                 0
>>>> tpstats_2011-10-05_15:30:01:ReadRepairStage                   0         0       45628426         0                 0
>>>> tpstats_2011-10-05_15:35:01:ReadRepairStage                   0         0       45688694         0                 0
>>>> tpstats_2011-10-05_15:40:01:ReadRepairStage                   0         3       45743029         0                 0
>>>> tpstats_2011-10-05_15:45:02:ReadRepairStage                   0         0       45801167         0                 0
>>>> tpstats_2011-10-05_15:50:02:ReadRepairStage                   0         0       45837329         0                 0
>>>> tpstats_2011-10-05_15:55:01:ReadRepairStage                   0         0       45890326         0                 0
>>>> tpstats_2011-10-05_16:00:01:ReadRepairStage                   0         0       45951703         0                 0
>>>> tpstats_2011-10-05_16:05:02:ReadRepairStage                   0         0       46010736         0                 0
>>>> tpstats_2011-10-05_16:10:01:ReadRepairStage                   0         0       46063294         0                 0
>>>> tpstats_2011-10-05_16:15:01:ReadRepairStage                   0         0       46108327         0                 0
>>>> tpstats_2011-10-05_16:20:01:ReadRepairStage                   0         0       46142291         0                 0
>>>>
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
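
The distinction made in the quoted replies (the counter records that a replica was checked in the background, not that it was out of sync) can be illustrated with a toy model of the A/B/C scenario from the first message. This is a hypothetical sketch, not Cassandra's implementation: `Replica`, `ToyCoordinator`, the rule that every read triggers one background comparison, and the "newest timestamp wins" resolution are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    timestamp: int  # timestamp of the newest value this replica holds for <key>


class ToyCoordinator:
    """Hypothetical model of a coordinator; not Cassandra's actual code."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.checks = 0   # background comparisons performed (what the counter tracks)
        self.repairs = 0  # replicas that were actually stale and got updated

    def quorum_read(self):
        quorum = len(self.replicas) // 2 + 1
        # Assumption: the first `quorum` replicas in the list answer first.
        answered = self.replicas[:quorum]
        result = max(r.timestamp for r in answered)
        self._background_check()
        return result

    def _background_check(self):
        # Counted on every read, whether or not anything is out of sync.
        self.checks += 1
        newest = max(r.timestamp for r in self.replicas)
        for r in self.replicas:
            if r.timestamp < newest:
                r.timestamp = newest
                self.repairs += 1


# Scenario from the thread: A and B hold ts2, C still holds ts1.
coordinator = ToyCoordinator([Replica("A", 2), Replica("B", 2), Replica("C", 1)])
first_result = coordinator.quorum_read()
second_result = coordinator.quorum_read()
print("checks:", coordinator.checks, "repairs:", coordinator.repairs)
```

In this model the check counter rises on both reads, while only the first read finds a stale replica to repair, so a steadily increasing count by itself says nothing about how often replicas disagreed.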
