cassandra-user mailing list archives

From Ramesh Natarajan <rames...@gmail.com>
Subject Re: Consistency level and ReadRepair
Date Wed, 05 Oct 2011 19:00:32 GMT
Let's assume we have 3 nodes, all up and running at all times, with no
failures or communication problems.
1. If I have RF=3 and write with QUORUM, and the change gets committed
on 2 nodes, what delay should we expect before the 3rd replica gets the
write? (See the sketch after these questions.)
2. In this scenario (no failures etc.), if we do a QUORUM read, what
situation can lead to a read repair? I didn't expect any ReadRepair
because all 3 replicas should have the same value.
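
For question 1, a minimal sketch of the quorum arithmetic I have in mind;
the class and method names below are invented for illustration, not
Cassandra internals. The write goes to all 3 replicas in parallel, but the
coordinator only waits for floor(RF/2)+1 = 2 acknowledgements, so the 3rd
replica normally lags only by network and queueing delay.

    // Illustrative quorum math for a QUORUM write at RF=3.
    // Names are made up for this sketch, not Cassandra's actual classes.
    public class QuorumSketch {

        static int quorumFor(int replicationFactor) {
            // QUORUM = floor(RF / 2) + 1
            return replicationFactor / 2 + 1;
        }

        public static void main(String[] args) {
            int rf = 3;
            int blockFor = quorumFor(rf); // 2 when RF=3
            System.out.println("RF=" + rf + ": coordinator waits for "
                    + blockFor + " of " + rf + " acks");
            // The mutation is still sent to all RF replicas in parallel; the
            // client just gets its answer once blockFor of them acknowledge.
        }
    }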


On Wed, Oct 5, 2011 at 1:11 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> Start with http://wiki.apache.org/cassandra/ReadRepair.  Read repair
> count increasing just means you were doing reads at < CL.ALL, and had
> the CF configured to perform RR.
>
> On Wed, Oct 5, 2011 at 12:37 PM, Ramesh Natarajan <ramesh25@gmail.com> wrote:
>> I have a 12 node Cassandra cluster running with RF=3.  I have several
>> clients (all running on a single node) connecting to the cluster
>> (fixed client-node mapping) and doing inserts, updates, selects and
>> deletes. Each client has a fixed mapping of row keys and always
>> connects to the same node. The timestamp on the client node is used
>> for all operations.  All operations are done at CL QUORUM.
>>
>> When I run tpstats I see the ReadRepair count consistently
>> increasing. I need to figure out why ReadRepair is happening.
>>
>> One scenario I can think of is that it could happen when there is a
>> delay in updating the nodes to reach eventual consistency.
>>
>> Let's say I have 3 nodes (RF=3): A, B, C. I insert <key> with timestamp
>> <ts1> via A, and the call returns as soon as the record is written to
>> A and B. At some later point this information is sent to C...
>>
>> A while later A,B,C have the same data with the same timestamp.
>>
>> A <key,ts1>
>> B <key, ts1> and
>> C <key, ts1>
>>
>> When I update <key> with timestamp <ts2> via A, the call again returns
>> as soon as the record is written to A and B.
>> Now the data is
>>
>> A <key,ts2>
>> B <key,ts2>
>> C <key,ts1>
>>
>> Assuming I query for <key>, A and C respond, and since their values
>> don't agree there is no quorum yet; the coordinator waits for B, and
>> once A and B match the response is returned to the client and a
>> ReadRepair is sent to C.
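
A rough sketch of that resolution step, assuming the usual last-write-wins
rule on timestamps; the names below are invented for illustration, not
Cassandra's actual read path. The coordinator takes the replica value with
the highest timestamp and pushes it back to any replica that returned an
older one.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative last-write-wins reconciliation for the A/B/C scenario above.
    public class ReadRepairSketch {

        static class Version {
            final long timestamp;
            final String value;
            Version(long timestamp, String value) {
                this.timestamp = timestamp;
                this.value = value;
            }
        }

        public static void main(String[] args) {
            // Replica states from the scenario: A and B already have ts2, C still has ts1.
            Map<String, Version> replicas = new HashMap<String, Version>();
            replicas.put("A", new Version(2L, "v2"));
            replicas.put("B", new Version(2L, "v2"));
            replicas.put("C", new Version(1L, "v1"));

            // Coordinator picks the newest version among the responses it received.
            Version newest = null;
            for (Version v : replicas.values()) {
                if (newest == null || v.timestamp > newest.timestamp) newest = v;
            }

            // Any replica that answered with an older timestamp gets a repair write.
            for (Map.Entry<String, Version> e : replicas.entrySet()) {
                if (e.getValue().timestamp < newest.timestamp) {
                    System.out.println("read repair: send <key," + newest.timestamp
                            + "> to replica " + e.getKey());
                }
            }
        }
    }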
>>
>> This could happen only when C is running behind in catching up with
>> the updates to A and B. Are there any stats that would let me know
>> whether the system is in a consistent state?
>>
>> thanks
>> Ramesh
>>
>>
>> tpstats_2011-10-05_12:50:01:ReadRepairStage   0   0   43569781   0   0
>> tpstats_2011-10-05_12:55:01:ReadRepairStage   0   0   43646420   0   0
>> tpstats_2011-10-05_13:00:02:ReadRepairStage   0   0   43725850   0   0
>> tpstats_2011-10-05_13:05:01:ReadRepairStage   0   0   43790047   0   0
>> tpstats_2011-10-05_13:10:02:ReadRepairStage   0   0   43869704   0   0
>> tpstats_2011-10-05_13:15:01:ReadRepairStage   0   0   43945635   0   0
>> tpstats_2011-10-05_13:20:01:ReadRepairStage   0   0   44020406   0   0
>> tpstats_2011-10-05_13:25:02:ReadRepairStage   0   0   44093227   0   0
>> tpstats_2011-10-05_13:30:01:ReadRepairStage   0   0   44167455   0   0
>> tpstats_2011-10-05_13:35:02:ReadRepairStage   0   0   44247519   0   0
>> tpstats_2011-10-05_13:40:01:ReadRepairStage   0   0   44312726   0   0
>> tpstats_2011-10-05_13:45:01:ReadRepairStage   0   0   44387633   0   0
>> tpstats_2011-10-05_13:50:01:ReadRepairStage   0   0   44443683   0   0
>> tpstats_2011-10-05_13:55:02:ReadRepairStage   0   0   44499487   0   0
>> tpstats_2011-10-05_14:00:01:ReadRepairStage   0   0   44578656   0   0
>> tpstats_2011-10-05_14:05:01:ReadRepairStage   0   0   44647555   0   0
>> tpstats_2011-10-05_14:10:02:ReadRepairStage   0   0   44716730   0   0
>> tpstats_2011-10-05_14:15:01:ReadRepairStage   0   0   44776644   0   0
>> tpstats_2011-10-05_14:20:01:ReadRepairStage   0   0   44840237   0   0
>> tpstats_2011-10-05_14:25:01:ReadRepairStage   0   0   44891444   0   0
>> tpstats_2011-10-05_14:30:01:ReadRepairStage   0   0   44931105   0   0
>> tpstats_2011-10-05_14:35:02:ReadRepairStage   0   0   44976801   0   0
>> tpstats_2011-10-05_14:40:01:ReadRepairStage   0   0   45042220   0   0
>> tpstats_2011-10-05_14:45:01:ReadRepairStage   0   0   45112141   0   0
>> tpstats_2011-10-05_14:50:02:ReadRepairStage   0   0   45177816   0   0
>> tpstats_2011-10-05_14:55:02:ReadRepairStage   0   0   45246675   0   0
>> tpstats_2011-10-05_15:00:01:ReadRepairStage   0   0   45309533   0   0
>> tpstats_2011-10-05_15:05:01:ReadRepairStage   0   0   45357575   0   0
>> tpstats_2011-10-05_15:10:01:ReadRepairStage   0   0   45405943   0   0
>> tpstats_2011-10-05_15:15:01:ReadRepairStage   0   0   45458435   0   0
>> tpstats_2011-10-05_15:20:01:ReadRepairStage   0   2   45508253   0   0
>> tpstats_2011-10-05_15:25:01:ReadRepairStage   0   0   45570375   0   0
>> tpstats_2011-10-05_15:30:01:ReadRepairStage   0   0   45628426   0   0
>> tpstats_2011-10-05_15:35:01:ReadRepairStage   0   0   45688694   0   0
>> tpstats_2011-10-05_15:40:01:ReadRepairStage   0   3   45743029   0   0
>> tpstats_2011-10-05_15:45:02:ReadRepairStage   0   0   45801167   0   0
>> tpstats_2011-10-05_15:50:02:ReadRepairStage   0   0   45837329   0   0
>> tpstats_2011-10-05_15:55:01:ReadRepairStage   0   0   45890326   0   0
>> tpstats_2011-10-05_16:00:01:ReadRepairStage   0   0   45951703   0   0
>> tpstats_2011-10-05_16:05:02:ReadRepairStage   0   0   46010736   0   0
>> tpstats_2011-10-05_16:10:01:ReadRepairStage   0   0   46063294   0   0
>> tpstats_2011-10-05_16:15:01:ReadRepairStage   0   0   46108327   0   0
>> tpstats_2011-10-05_16:20:01:ReadRepairStage   0   0   46142291   0   0
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
