cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: What does ReadRepair exactly do?
Date Sun, 21 Oct 2012 22:49:40 GMT
There are two processes in Cassandra that trigger Read Repair-like behaviour.

During a read, a DigestMismatchException is raised if the responses from the replicas do not match.
In this case another read is run that involves reading all the data. This is the CL level
agreement kicking in. 
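
As a rough illustration of that first path, here is a minimal Python sketch. It is not Cassandra's code: rows are modelled as plain dicts of column name -> (value, timestamp), and digest(), read() and read_with_repair() are hypothetical names made up for the example. MD5 is used only as a convenient stable hash.

```python
import hashlib


class DigestMismatch(Exception):
    """Raised when a replica's digest differs from the digest of the data response."""


def digest(row):
    # Hash the columns in a stable order so equal rows give equal digests.
    h = hashlib.md5()
    for name in sorted(row):
        value, ts = row[name]
        h.update(f"{name}:{value}:{ts};".encode())
    return h.hexdigest()


def read(replica_rows, cl):
    # Only CL replicas are contacted: the first returns data, the rest digests.
    contacted = replica_rows[:cl]
    data = contacted[0]
    expected = digest(data)
    for other in contacted[1:]:
        if digest(other) != expected:
            raise DigestMismatch()
    return data


def read_with_repair(replica_rows, cl):
    try:
        return read(replica_rows, cl)
    except DigestMismatch:
        # Second read: fetch full data from the same replicas and keep,
        # for each column, the version with the highest timestamp.
        merged = {}
        for row in replica_rows[:cl]:
            for name, (value, ts) in row.items():
                if name not in merged or ts > merged[name][1]:
                    merged[name] = (value, ts)
        return merged
```

With two replicas holding {"c": ("old", 1)} and {"c": ("new", 2)}, a read at CL 2 hits the mismatch and returns the newer column from the second pass.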

The other "Read Repair" is the one controlled by the "read_repair_chance". When RR is active
on a request ALL up replicas are involved in the read. When RR is not active only CL replicas
are involved. The test for CL agreement occurs synchronously to the request; the RR check
waits asynchronously to the request for all nodes in the request to return. It then checks
for consistency and repairs differences. 
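
The replica selection described above can be sketched like this. It is a simplified model under stated assumptions, not the actual implementation: replicas_for_read() and its parameters are hypothetical names, and the random draw stands in for however the per-request decision is really made.

```python
import random


def replicas_for_read(live_replicas, cl_count, read_repair_chance, rng=random):
    """Return (replicas to contact, whether RR is active for this request)."""
    # rng.random() is in [0.0, 1.0), so a chance of 1.0 always activates RR
    # and a chance of 0.0 never does.
    if rng.random() < read_repair_chance:
        # RR active: every replica that is up takes part in the read.
        return list(live_replicas), True
    # RR not active: only as many replicas as the CL requires.
    return list(live_replicas[:cl_count]), False
```

For example, replicas_for_read(["n1", "n2", "n3"], 2, 1.0) contacts all three nodes, while a chance of 0.0 contacts only the first two.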

> From looking at the source code, I do not understand how this set is built and I do not
> understand how the reconciliation is executed.

When a DigestMismatch is detected a read is run using RepairCallback. The callback will call
the RowRepairResolver.resolve() when enough responses have been collected. 

resolveSuperset() picks one response to be the baseline, and then calls delete() to apply row
level deletes from the other responses (ColumnFamily objects). It collects the other CFs into an
iterator with a filter that returns all columns. The columns are then applied to the baseline
CF which may result in reconcile() being called. 

reconcile() is used when an AbstractColumnContainer has two versions of a column and it wants
to only have one. 
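
To make the merge concrete, here is a minimal Python sketch of the idea. It is not the Java code: a row is a dict of column name -> (value, timestamp), row level deletes are left out, and the tie-break on equal timestamps (larger value wins) is an assumption of the sketch. The function names only mirror the methods discussed above.

```python
def reconcile(a, b):
    """Choose one of two versions of the same column, each a (value, timestamp) pair."""
    if a[1] != b[1]:
        # The version with the higher timestamp wins.
        return a if a[1] > b[1] else b
    # Assumption for this sketch: on equal timestamps keep the larger value.
    return a if a[0] >= b[0] else b


def resolve_superset(versions):
    """Merge the rows returned by the replicas into one resolved row."""
    versions = [v for v in versions if v is not None]
    if not versions:
        return None
    # Copy the first response to use as the baseline.
    resolved = dict(versions[0])
    # Apply every column from the other responses to the baseline.
    for other in versions[1:]:
        for name, column in other.items():
            if name in resolved:
                resolved[name] = reconcile(resolved[name], column)
            else:
                resolved[name] = column
    return resolved
```

The result holds every column seen on any replica, each at its winning version, which is why it is a "superset" of the individual responses.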

RowRepairResolver.scheduleRepairs() works out the delta for each node by calling ColumnFamily.diff().
The delta is then sent to the appropriate node.
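
A sketch of that last step, again as a simplified Python model rather than the real code: rows are dicts of column name -> (value, timestamp), and diff() and schedule_repairs() are hypothetical stand-ins that only borrow the names of the methods above.

```python
def diff(replica_row, resolved_row):
    """Columns from the resolved row that the replica is missing or holds an older version of."""
    delta = {}
    for name, (value, ts) in resolved_row.items():
        current = replica_row.get(name)
        if current is None or current[1] < ts:
            delta[name] = (value, ts)
    return delta


def schedule_repairs(resolved_row, replica_rows):
    """Map each out-of-date node to the delta that would be sent to it."""
    repairs = {}
    for node, row in replica_rows.items():
        delta = diff(row, resolved_row)
        if delta:
            # Nodes that already match the resolved row get nothing.
            repairs[node] = delta
    return repairs
```

Only the columns a node is behind on are included, so a replica that already agrees with the resolved row receives no repair at all.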


Hope that helps. 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/10/2012, at 6:33 AM, Markus Klems <markusklems@gmail.com> wrote:

> Hi guys,
> 
> I am looking through the Cassandra source code in the github trunk to better understand
> how Cassandra's fault-tolerance mechanisms work. Most things make sense. I am also aware of
> the wiki and DataStax documentation. However, I do not understand what read repair does in
> detail. The method RowRepairResolver.resolveSuperset(Iterable<ColumnFamily> versions)
> seems to do the trick of merging conflicting versions of column family replicas and builds
> the set of columns that need to be "repaired". From looking at the source code, I do not understand
> how this set is built and I do not understand how the reconciliation is executed. ReadRepair
> does not seem to trigger a Column.reconcile() to reconcile conflicting column versions on
> different servers. Does it?
> 
> If this is not what read repair does, then: What kind of inconsistencies are resolved
> by read repair? And: How are the inconsistencies resolved?
> 
> Could someone give me a hint?
> 
> Thanks so much,
> 
> -Markus

