cassandra-user mailing list archives

From Anuj Wadehra <>
Subject Re: Partition range incremental repairs
Date Tue, 06 Jun 2017 15:08:16 GMT
Hi Chris,
Using -pr with incremental repairs does not make sense. Primary range repair is an optimization
over full repair. If you run full repair on every node of an n-node cluster with RF=3, you
repair each piece of data three times. E.g. in a 5-node cluster with RF=3, a range may exist
on nodes A, B and C. When full repair is run on node A, all the data in that range gets synced
with the replicas on nodes B and C. When you then run full repair on nodes B and C, you waste
resources repairing data that is already repaired.
Primary range repair ensures that when you run repair on a node, it ONLY repairs the data
which is owned by that node (its primary range). Thus, no node repairs data that it does not
own and that must be repaired by another node. Redundant work is eliminated.
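
The counting argument above can be sketched in a few lines of Python. This is only a toy model, not Cassandra code: it assumes a simple ring where each range is named after its primary owner and is replicated to the next RF-1 nodes, and the helper names (`replicas`, `ranges_repaired_by`, `repair_counts`) are made up for illustration.

```python
from collections import Counter

NODES = ["A", "B", "C", "D", "E"]  # the 5-node example cluster
RF = 3


def replicas(primary_index):
    """Nodes holding the range whose primary owner is NODES[primary_index].

    Assumption: replicas are simply the next RF-1 nodes on the ring.
    """
    return [NODES[(primary_index + k) % len(NODES)] for k in range(RF)]


def ranges_repaired_by(node, primary_only):
    """Ranges (named by primary owner) covered by a repair started on `node`."""
    if primary_only:
        return [node]  # -pr: only the node's own primary range
    # Full repair: every range the node holds a replica of.
    return [NODES[i] for i in range(len(NODES)) if node in replicas(i)]


def repair_counts(primary_only):
    """How many times each range is repaired when every node runs one repair."""
    counts = Counter()
    for node in NODES:
        counts.update(ranges_repaired_by(node, primary_only))
    return counts


full = repair_counts(primary_only=False)
pr = repair_counts(primary_only=True)
print("full repair on every node:", dict(full))
print("-pr repair on every node: ", dict(pr))
```

In this model every range is repaired RF times per cycle with full repair and exactly once with -pr.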
Even with -pr, each time you run it on all nodes, you repair 100% of the data. Why repair the
complete data set in each cycle, even data which has not changed since the last repair cycle?
This is where incremental repair comes in as an improvement. Once repaired, data is marked as
repaired so that the next repair cycle can focus on just the delta. Now, let's go back to the
example of a 5-node cluster with RF=3. This time we run incremental repair on all nodes. When
you repair all the data on node A, all 3 replicas are marked as repaired. Even if you run inc
repair on all ranges on the second node, you would not re-repair the already repaired data.
Thus, there is no advantage in repairing only the data owned by the node (the primary range
of the node). You can run inc repair on all the data present on a node, and Cassandra makes
sure that when you repair data on other nodes, you only repair unrepaired data.
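
To illustrate the reasoning above, here is a rough Python sketch of two incremental repair cycles on the same 5-node, RF=3 example. It is a toy model under stated assumptions: data is tracked as an "unrepaired amount" per range, a repair marks the whole range as repaired on all replicas, and no writes arrive while a cycle is running. It does not model anticompaction or validation cost, and the names `ranges_on` and `incremental_cycle` are hypothetical.

```python
NODES = ["A", "B", "C", "D", "E"]
RF = 3


def ranges_on(node):
    """Ranges (named by primary owner) that `node` holds a replica of.

    Assumption: each range is replicated to the next RF-1 nodes on the ring.
    """
    n = len(NODES)
    return {
        NODES[i]
        for i in range(n)
        if node in [NODES[(i + k) % n] for k in range(RF)]
    }


def incremental_cycle(unrepaired):
    """Run a non-pr incremental repair on every node in turn.

    `unrepaired` maps range -> amount of data not yet marked repaired.
    Returns the amount of data each node actually had to repair.
    """
    work = {}
    for node in NODES:
        done = 0
        for r in sorted(ranges_on(node)):
            done += unrepaired[r]
            unrepaired[r] = 0  # every replica of this range is now marked repaired
        work[node] = done
    return work


unrepaired = {r: 100 for r in NODES}     # first cycle: nothing repaired yet
first = incremental_cycle(unrepaired)

for r in NODES:                          # 10 units of new writes per range
    unrepaired[r] += 10
second = incremental_cycle(unrepaired)

print("first cycle: ", first, "total", sum(first.values()))
print("second cycle:", second, "total", sum(second.values()))
```

In this model the total work per cycle equals the unrepaired data, not RF times it, even though every node repairs all of its ranges; later nodes simply find most of their ranges already marked repaired.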

  On Tue, Jun 6, 2017 at 4:27 PM, Chris Stokesmore<> wrote:
  Hi all,

Wondering if anyone had any thoughts on this? At the moment the long-running repairs cause
us to be running them on two nodes at once for a bit of time, which obviously increases the
cluster load.

On 2017-05-25 16:18 (+0100), Chris Stokesmore <> wrote: 
> Hi,
>
> We are running a 7 node Cassandra 2.2.8 cluster, RF=3, and had been running repairs with
> the -pr option, via a cron job that runs on each node once per week.
>
> We changed that as some advice on the Cassandra IRC channel said it would cause more
> anticompaction and
>  says 'Performing partitioner range repairs by using the -pr option is generally considered
> a good choice for doing manual repairs. However, this option cannot be used with incremental
> repairs (default for Cassandra 2.2 and later)'
>
> Only problem is our -pr repairs were taking about 8 hours, and now the non-pr repairs
> are taking 24+ - I guess this makes sense, repairing 1/7 of data increased to 3/7, except
> I was hoping to see a speed up after the first loop through the cluster as each repair will
> be marking much more data as repaired, right?
>
> Is running -pr with incremental repairs really that bad?
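
The 1/7 versus 3/7 estimate in the quoted message can be checked with a short calculation. This is a back-of-the-envelope sketch that assumes tokens are evenly distributed and that repair time scales linearly with the share of the ring covered; `ring_share` is a hypothetical helper, not a Cassandra API.

```python
from fractions import Fraction


def ring_share(nodes, rf, primary_only):
    """Fraction of the ring one node's repair covers (even token distribution assumed)."""
    return Fraction(1 if primary_only else rf, nodes)


pr_share = ring_share(nodes=7, rf=3, primary_only=True)       # primary range only
non_pr_share = ring_share(nodes=7, rf=3, primary_only=False)  # all ranges on the node

ratio = non_pr_share / pr_share
estimated_hours = 8 * ratio  # scaling the observed 8-hour -pr run linearly

print(f"-pr covers {pr_share} of the ring, non-pr covers {non_pr_share}")
print(f"non-pr is {ratio}x the work, roughly {estimated_hours} hours")
```

Under those assumptions the non-pr run covers three times as much data, which is consistent with the jump from about 8 hours to 24+.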