cassandra-user mailing list archives

From shankarpnsn <shankarp...@gmail.com>
Subject Re: What does ReadRepair exactly do?
Date Thu, 25 Oct 2012 15:45:38 GMT
aaron morton wrote
>> 2. You do a write operation (W1) with quorum of val=2
>> node1 = val1 node2 = val2 node3 = val1  (write val2 is not complete yet)
> If the write has not completed then it is not a successful write at the
> specified CL as it could fail now.
> 
> Therefore the R + W > N Strong Consistency guarantee does not apply at this
> exact point in time. A read to the cluster at this exact point in time
> using QUORUM may return val2 or val1. Again, the operation W1 has not
> completed; if read R' starts and completes while W1 is processing, it may
> or may not return the result of W1.

I agree completely that this indeterminism is fair for partial, failed, or
in-flight writes, since the result depends on which nodes respond to a
subsequent read.
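
To make the point above concrete, here is a minimal Python sketch of the
three-replica example. It is a toy model, not Cassandra code: the names
(IN_FLIGHT_STATE, quorum, always_overlap, possible_read_results) and the
integer timestamps are assumptions for illustration, with val2 assumed to
be the newer value written by the in-flight W1.

```python
from itertools import combinations

# Toy model: each replica holds (value, timestamp). Timestamps are assumed
# for illustration; val2 is the newer value from the in-flight write W1.
IN_FLIGHT_STATE = {
    "node1": ("val1", 1),
    "node2": ("val2", 2),
    "node3": ("val1", 1),
}


def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def always_overlap(n, r, w):
    """True if every read set of size r shares a node with every write set of size w."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


def possible_read_results(state, r):
    """Values a read could return, depending on which r replicas answer."""
    results = set()
    for read_set in combinations(sorted(state), r):
        # Resolve by timestamp: the newest cell among the responders wins.
        value, _ts = max((state[name] for name in read_set), key=lambda cell: cell[1])
        results.add(value)
    return results
```

With N = 3 and R = W = 2, always_overlap(3, 2, 2) holds, but that only helps
once W1 has reached a quorum. While val2 sits on node2 alone,
possible_read_results(IN_FLIGHT_STATE, 2) contains both val1 and val2; once
a second replica holds val2, only val2 remains.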


aaron morton wrote
> It's important to point out the difference between Read Repair, in the
> context of the read_repair_chance setting, and Consistent Reads in the
> context of the CL setting. All of this is outside of the processing of
> your read request. It is separate from the stuff below.
> 
> Inside the user read request when ReadCallback.get() is called and CL
> nodes have responded the responses are compared. If a DigestMismatch
> happens then a Row Repair read is started, the result of this read is
> returned to the user. This Row Repair read MAY detect differences, if it
> does it resolves the super set, sends the delta to the replicas and
> returns the super set value to be returned to the client. 
> 
>> In this case, for read R1, the value val2 does not have a quorum. Would
>> read R1 return val2 or val4?
> 
> If val4 is in the memtable on node before the second read the result will
> be val4.  
> Writes that happen between the initial read and the second read after a
> Digest Mismatch are included in the read result.

Thanks for clarifying this, Aaron. This is very much in line with what I
figured out from the code, and it brings me back to my initial question about
when and what the user/client gets to see as the read result. Let us, for
now, consider only the repairs initiated as a part of /consistent reads/. If
the Row Repair returns the super set value to the user immediately (after
resolving and sending the deltas to the replicas, but without waiting for a
quorum of them to confirm the repair), wouldn't that be a breach of the
consistent reads paradigm? My intuition is that we would be responding to
the client before the replicas have confirmed that they meet the
consistency requirement.
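
For readers following along, the "resolve the super set, send the delta"
step can be sketched roughly as below. This is a simplified illustration in
Python, not the actual resolver: it assumes a row is a dict mapping a column
name to a (value, timestamp) pair and that the highest timestamp wins. The
function names and the "other" column are hypothetical.

```python
def resolve(versions):
    """Merge replica rows into a super set, keeping the newest cell per column."""
    superset = {}
    for row in versions:
        for column, (value, timestamp) in row.items():
            if column not in superset or timestamp > superset[column][1]:
                superset[column] = (value, timestamp)
    return superset


def delta(superset, row):
    """Cells a replica is missing or holds in an older version."""
    return {
        column: cell
        for column, cell in superset.items()
        if column not in row or row[column][1] < cell[1]
    }


# Hypothetical replica rows: node2 has the newer "val" cell and an extra column.
node1_row = {"val": ("val1", 1)}
node2_row = {"val": ("val2", 2), "other": ("x", 1)}

merged = resolve([node1_row, node2_row])
repair_for_node1 = delta(merged, node1_row)  # both cells need to go to node1
repair_for_node2 = delta(merged, node2_row)  # empty: node2 is already current
```

The question in this message is about what happens between computing
repair_for_node1 and answering the client, which this sketch deliberately
leaves out.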

I agree that returning val4 is the right thing to do if a quorum (two) of
the nodes (node1, node2, node3) have val4 at the second read after the
digest mismatch. But wouldn't it be incorrect to respond to the user with
any value when the second read (after the mismatch) doesn't find a quorum?
So after sending the deltas to the replicas as a part of the repair (still a
part of /consistent reads/), shouldn't the value be read again to check for
the presence of a quorum after the repair?

In the example we had, assume the mismatch is detected during a read R1 from
coordinator node C that reaches node1 and node2.
State seen by C after first read R1:  <node1 = val1, node2 = val2, node3 =
val1>

A second read is initiated as a part of the repair for the consistent read R1.
This second read observes the values (val1, val2) from (node1, node2) and
sends the corresponding row repair delta to node1. I'm guessing C cannot
respond to the user with val2 until C knows that node1 has actually written
the value val2, thereby meeting the quorum. Is this interpretation correct?
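
The two possibilities being asked about can be written out as a small
Python simulation. This is only a toy model of the scenario above, not a
description of what Cassandra does: Replica, consistent_read,
wait_for_repair_ack, holders and fresh_cluster are hypothetical names, and
the timestamps (val2 newer than val1) are assumed.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    value: str
    timestamp: int

    def apply(self, value, timestamp):
        """Last-write-wins: keep the cell with the higher timestamp, then ack."""
        if timestamp > self.timestamp:
            self.value, self.timestamp = value, timestamp
        return True


def consistent_read(contacted, wait_for_repair_ack):
    """Return (value, names of contacted replicas whose repair is unconfirmed)."""
    newest = max(contacted, key=lambda r: r.timestamp)
    value, timestamp = newest.value, newest.timestamp
    stale = [r for r in contacted if r.timestamp < timestamp]
    if not wait_for_repair_ack:
        # Deltas are sent, but the reply does not wait for them to be applied.
        return value, [r.name for r in stale]
    # Apply each delta and keep only the replicas that failed to acknowledge.
    unconfirmed = [r.name for r in stale if not r.apply(value, timestamp)]
    return value, unconfirmed


def holders(replicas, value):
    """Number of replicas currently storing the given value."""
    return sum(1 for r in replicas if r.value == value)


def fresh_cluster():
    """State from the example: only node2 has val2 so far."""
    return [
        Replica("node1", "val1", 1),
        Replica("node2", "val2", 2),
        Replica("node3", "val1", 1),
    ]
```

In this model, calling consistent_read(cluster[:2], wait_for_repair_ack=False)
on a fresh cluster returns val2 while holders(cluster, "val2") is still 1,
which is the situation I am worried about. With wait_for_repair_ack=True the
same call returns val2 only after node1 has applied it, so two of the three
replicas hold val2 when C answers.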

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/What-does-ReadRepair-exactly-do-tp7583261p7583395.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.
