cassandra-user mailing list archives

From Manu Zhang <>
Subject Re: What does ReadRepair exactly do?
Date Thu, 25 Oct 2012 16:37:41 GMT
A quorum read doesn't mean that we read the newest value from a quorum of
replicas. It ensures that at least one of the replicas we read holds the
newest value, as long as a quorum write succeeded beforehand and W + R > N.
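
To make the W + R > N point concrete, here is a minimal brute-force sketch in
Python. It only models replicas as integers and checks set overlap; the
function name quorums_always_overlap is made up for illustration and has
nothing to do with Cassandra's code.

```python
from itertools import combinations


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """Return True if every set of w replicas shares at least one
    member with every set of r replicas, out of n replicas total."""
    replicas = range(n)
    return all(
        bool(set(write_set) & set(read_set))
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


# N = 3 with QUORUM (2) writes and QUORUM (2) reads: W + R = 4 > 3
print(quorums_always_overlap(3, 2, 2))  # True

# N = 3 with ONE (1) writes and QUORUM (2) reads: W + R = 3, not > 3
print(quorums_always_overlap(3, 1, 2))  # False
```

With N = 3 and W = R = 2, any two replicas we read must include at least one
of the two replicas that acknowledged the write, so the newest value is among
the responses even if the other response is stale.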

On Fri, Oct 26, 2012 at 12:00 AM, Hiller, Dean <> wrote:

> Kind of an interesting question
> I think you are saying: suppose a client read resolved against only the two
> nodes mentioned in Aaron's email and returned that result to the client,
> and read repair was kicked off because of the inconsistent values, all
> while the write had not yet completed. Then two nodes would have to go down
> right after the read, and before the write finished, to lose the value, so
> that the client had read a value that was never stored in the database.
> The odds of two nodes going down are pretty slim, though.
> Or, what if the node holding part of the write went down? As long as the
> client stays up, it would complete its write on the other two nodes.
> It seems to me that as long as two nodes don't fail, you are reading at
> quorum and fit the consistency model, since you get a value that will be on
> two nodes in the immediate future.
> Thanks,
> Dean
> On 10/25/12 9:45 AM, "shankarpnsn" <> wrote:
> >aaron morton wrote
> >>> 2. You do a write operation (W1) with quorum of val=2
> >>> node1 = val1 node2 = val2 node3 = val1  (write val2 is not complete yet)
> >> If the write has not completed then it is not a successful write at the
> >> specified CL as it could fail now.
> >>
> >> Therefore the R + W > N Strong Consistency guarantee does not apply at
> >> this exact point in time. A read to the cluster at this exact point in
> >> time using QUORUM may return val2 or val1. Again, the operation W1 has
> >> not completed; if read R' starts and completes while W1 is processing,
> >> it may or may not return the result of W1.
> >
> >I agree completely that it is fair to have this indeterminism in case of
> >partial/failed/in-flight writes, based on what nodes respond to a
> >subsequent read.
> >
> >
> >aaron morton wrote
> >> It's important to point out the difference between Read Repair, in the
> >> context of the read_repair_chance setting, and Consistent Reads in the
> >> context of the CL setting. All of this is outside of the processing of
> >> your read request. It is separate from the stuff below.
> >>
> >> Inside the user read request, when ReadCallback.get() is called and CL
> >> nodes have responded, the responses are compared. If a DigestMismatch
> >> happens then a Row Repair read is started, and the result of this read
> >> is returned to the user. This Row Repair read MAY detect differences; if
> >> it does, it resolves the super set, sends the delta to the replicas and
> >> returns the super set value to the client.
> >>
> >>> In this case, for read R1, the value val2 does not have a quorum. Would
> >>> read R1 return val2 or val4 ?
> >>
> >> If val4 is in the memtable on node before the second read the result
> >> will be val4.
> >> Writes that happen between the initial read and the second read after a
> >> Digest Mismatch are included in the read result.
> >
> >Thanks for clarifying this, Aaron. This is very much in line with what I
> >figured out from the code and brings me back to my initial question of
> >when and what the user/client gets to see as the read result. Let us, for
> >now, consider only the repairs initiated as a part of /consistent reads/.
> >If the Row Repair (after resolving and sending the deltas to replicas, but
> >without waiting for a quorum success after the repair) returns the super
> >set value immediately to the user, wouldn't that be a breach of the
> >consistent reads paradigm? My intuition is that we would be responding to
> >the client before the replicas have confirmed that they meet the
> >consistency requirement.
> >
> >I agree that returning val4 is the right thing to do if a quorum (two) of
> >the nodes (node1, node2, node3) have val4 at the second read after the
> >digest mismatch. But wouldn't it be incorrect to respond to the user with
> >any value when the second read (after the mismatch) doesn't find a quorum?
> >So after sending the deltas to the replicas as a part of the repair (still
> >a part of /consistent reads/), shouldn't the value be read again to check
> >for the presence of a quorum after the repair?
> >
> >In the example we had, assume the mismatch is detected during a read R1
> >from coordinator node C that reaches node1 and node2.
> >State seen by C after first read R1: <node1 = val1, node2 = val2, node3 = val1>
> >
> >A second read is initiated as a part of repair for consistent read of R1.
> >This second read observes the values (val1, val2) from (node1, node2) and
> >sends the corresponding row repair delta to node1. I'm guessing C cannot
> >respond to the user with val2 until C knows that node1 has actually
> >written the value val2, thereby meeting the quorum. Is this interpretation
> >correct?
> >

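For reference, the resolve step discussed in the quoted thread can be modelled
roughly as below. This is a simplified sketch, not Cassandra's implementation:
it assumes a single column reconciled by newest timestamp, and the names
Versioned and resolve are invented for illustration. It deliberately leaves
open the question raised above of whether the coordinator waits for the repair
writes to be acknowledged before answering the client.

```python
from typing import Dict, List, NamedTuple, Tuple


class Versioned(NamedTuple):
    """One replica's answer for a column (hypothetical model)."""
    value: str
    timestamp: int


def resolve(responses: Dict[str, Versioned]) -> Tuple[Versioned, List[str]]:
    """Pick the newest value among the responses and list the replicas
    that returned an older one and therefore need a repair write."""
    newest = max(responses.values(), key=lambda v: v.timestamp)
    stale = sorted(
        node for node, v in responses.items() if v.timestamp < newest.timestamp
    )
    return newest, stale


# The state from the example: the read reaches node1 and node2 only.
responses = {"node1": Versioned("val1", 1), "node2": Versioned("val2", 2)}
newest, stale = resolve(responses)
print(newest.value, stale)  # val2 ['node1']
```

In this model val2 is what the client would get back, and node1 is the replica
that would be sent the delta.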