cassandra-commits mailing list archives

From "Blake Eggleston (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9143) Improving consistency of repairAt field across replicas
Date Tue, 08 Nov 2016 23:27:58 GMT


Blake Eggleston commented on CASSANDRA-9143:

| [trunk|] | [dtest|] | [testall|] |
| [3.0|] | [dtest|] | [testall|] |

[dtest branch|]

I've tried to break this up into logical commits for each component of the change to make
reviewing easier.

The new incremental repair would work as follows:
# persist the session locally on each repair participant
# anti-compact all unrepaired sstables intersecting with the range being repaired into a pending
repair bucket
# perform validation/sync against the sstables segregated in the pending anti-compaction step
# perform a 2PC to promote the pending repair sstables into repaired
#* if this phase, or the validation/sync phase, fails, the sstables are moved back into unrepaired
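As a rough illustration of the sstable lifecycle in the steps above, consider the following sketch. The class, bucket names, and methods here are hypothetical, chosen only to show the state transitions; they are not Cassandra's actual code.

```java
import java.util.List;

// Hypothetical sketch of the pending-repair sstable lifecycle described above.
// All names here are illustrative, not Cassandra's real API.
public class PendingRepairSketch {
    enum Bucket { UNREPAIRED, PENDING_REPAIR, REPAIRED }

    static class SSTable {
        Bucket bucket = Bucket.UNREPAIRED;
    }

    // Step 2: anti-compact intersecting unrepaired sstables into a pending bucket,
    // segregating them from the rest of the unrepaired data.
    static void antiCompactIntoPending(List<SSTable> unrepaired) {
        for (SSTable t : unrepaired)
            t.bucket = Bucket.PENDING_REPAIR;
    }

    // Step 4: promote the pending sstables to repaired on success; on any
    // validation/sync or promotion failure, move them back to unrepaired.
    static void finishSession(List<SSTable> pending, boolean sessionSucceeded) {
        for (SSTable t : pending)
            t.bucket = sessionSucceeded ? Bucket.REPAIRED : Bucket.UNREPAIRED;
    }
}
```

The point of the pending bucket is that failure is always recoverable: until the final promotion commits, every sstable can be returned to unrepaired without losing information.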

Since incremental repair is the default in 3.0, I've also included a patch which fixes the
consistency problems in 3.0 and is backwards compatible with the existing repair. That said,
I'm not convinced that making a change like this to repair in 3.0.x is a great idea.

I'd be more in favor of disabling incremental repair, or at least not making it the default,
in 3.0.x. The compaction that gets kicked off after streamed sstables are added to the cfs
means that whether repaired data ultimately ends up in the repaired or unrepaired bucket
after anti-compaction is basically a crapshoot.

> Improving consistency of repairAt field across replicas 
> --------------------------------------------------------
>                 Key: CASSANDRA-9143
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Blake Eggleston
>            Priority: Minor
> We currently send an anticompaction request to all replicas. During this, a node will
> split sstables and mark the appropriate ones repaired.
> The problem is that this could fail on some replicas for many reasons, leading to problems
> in the next repair.
> This is what I am suggesting to improve it.
> 1) Send the anticompaction request to all replicas. This can be done at the session level.
> 2) During anticompaction, sstables are split but not marked repaired.
> 3) When we get a positive ack from all replicas, the coordinator will send another message
> called markRepaired.
> 4) On getting this message, replicas will mark the appropriate sstables as repaired.
> This will reduce the window of failure. We can also think of "hinting" the markRepaired
> message if required.
> Also the sstables which are streaming can be marked as repaired like it is done now.
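The four-step protocol quoted above could be sketched roughly as follows. This is a hypothetical illustration; the `Replica` class, `antiCompact`, and `markRepaired` are made-up names standing in for Cassandra's actual repair messaging, which works differently in detail.

```java
import java.util.List;

// Hypothetical sketch of the two-step (anticompact, then markRepaired)
// protocol proposed in the ticket. Names are illustrative, not Cassandra's API.
public class MarkRepairedSketch {

    static class Replica {
        final boolean antiCompactionSucceeds;
        boolean repaired = false;

        Replica(boolean antiCompactionSucceeds) {
            this.antiCompactionSucceeds = antiCompactionSucceeds;
        }

        // Step 2: split sstables during anticompaction, but do NOT mark them repaired yet.
        boolean antiCompact() { return antiCompactionSucceeds; }

        // Step 4: only after the coordinator's markRepaired message do the
        // segregated sstables actually become repaired.
        void markRepaired() { repaired = true; }
    }

    // Steps 1 and 3: the coordinator sends markRepaired only once every
    // replica has positively acked its anticompaction request.
    static boolean runSession(List<Replica> replicas) {
        for (Replica r : replicas)
            if (!r.antiCompact())
                return false; // any failure: no replica is marked repaired
        for (Replica r : replicas)
            r.markRepaired();
        return true;
    }
}
```

The property this buys is the one the ticket asks for: a failure on any replica before the markRepaired phase leaves every replica's sstables unrepaired, so the repairedAt state cannot diverge across replicas mid-session.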

This message was sent by Atlassian JIRA
