incubator-cassandra-user mailing list archives

From Robert Coli <rc...@eventbrite.com>
Subject Re: Automated Repair on multiple nodes
Date Fri, 02 Aug 2013 22:38:28 GMT
On Fri, Aug 2, 2013 at 3:28 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

> We currently run automated repairs sequentially on all the nodes. However,
> as we grow the cluster we now need to run repair on multiple nodes in
> parallel to be able to finish it within gc_grace_seconds.
>
Or you could just increase gc_grace_seconds from the arbitrary and IMO
unreasonably low default of 10 days.

> Before I write the script I was wondering if somebody already has a tool
> or a script that figures out nodes that we can safely run repairs on in
> parallel. For instance we wouldn't run repair on replica nodes in parallel.
>
This only really works with non-virtual nodes, and only if you repair a
whole hardware node at a time. With 256 virtual nodes per node, your repair
overhead is spread evenly across the cluster anyway.

Someone has probably written the script, but if I were you I would consider
whether you really want to monitor N/RF fragile and independent repair
sessions simultaneously before using such a script.
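
For what it's worth, the node-selection part of such a script could look
roughly like the sketch below. It is only an illustration under stated
assumptions: one token per node (no vnodes), a single datacenter with
SimpleStrategy-style placement (a range lives on its owner plus the next
RF-1 nodes clockwise), and one primary-range repair per node (for example
`nodetool repair -pr`). The function names and the `ring` list are
hypothetical, not part of any Cassandra tool.

```python
def replica_set(ring, index, rf):
    """Nodes assumed to hold the primary range of ring[index]:
    the node itself plus the next rf-1 nodes clockwise."""
    n = len(ring)
    return {ring[(index + k) % n] for k in range(min(rf, n))}


def parallel_repair_batches(ring, rf):
    """Greedily group nodes into batches whose replica sets do not overlap.

    ring: node names in token order (assumed single-token nodes).
    rf:   replication factor.
    Returns a list of batches; nodes within one batch could be repaired
    at the same time without any node taking part in two sessions.
    """
    pending = list(range(len(ring)))
    batches = []
    while pending:
        busy = set()       # nodes already involved in this batch
        batch = []
        remaining = []
        for i in pending:
            replicas = replica_set(ring, i, rf)
            if busy.isdisjoint(replicas):
                batch.append(ring[i])
                busy |= replicas
            else:
                remaining.append(i)   # retry in a later batch
        batches.append(batch)
        pending = remaining
    return batches
```

With six nodes and RF=3 this yields three batches of two nodes each, which
is the N/RF concurrent sessions mentioned above. With five nodes and RF=3
every pair of replica sets overlaps, so it falls back to one node at a time.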

=Rob
