cassandra-user mailing list archives

From "Thakrar, Jayesh" <>
Subject Re: repair performance
Date Sat, 18 Mar 2017 16:54:20 GMT
You changed compaction_throughput_mb_per_sec, but did you also increase concurrent_compactors?

Regarding reaper, and some other info I received on the user list in reply to my question
about "nodetool repair", here are some useful links/slides -

From: Roland Otta <>
Date: Friday, March 17, 2017 at 5:47 PM
To: "" <>
Subject: Re: repair performance

i had not noticed that so far.

thank you for the hint. i will definitely give it a try

On Fri, 2017-03-17 at 22:32 +0100, benjamin roth wrote:
The fork from thelastpickle is compatible. I'd recommend giving it a try over plain nodetool.

2017-03-17 22:30 GMT+01:00 Roland Otta <<>>:

forgot to mention the version we are using:

we are using 3.0.7 - so i guess we should have incremental repairs by default.
it also prints out incremental:true when starting a repair
INFO  [Thread-7281] 2017-03-17 09:40:32,059 - Starting repair command #7, repairing keyspace xxx with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [ProdDC2], hosts: [], # of ranges: 1758)

3.0.7 is also the reason why we are not using reaper ... as far as i could figure out it's
not compatible with 3.0+

On Fri, 2017-03-17 at 22:13 +0100, benjamin roth wrote:
It depends a lot ...

- Repairs can be very slow, yes! (And unreliable, due to timeouts, outages, whatever)
- You can use incremental repairs to speed things up for regular repairs
- You can use "reaper" to schedule repairs and run them sliced, automated, failsafe
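
To illustrate what running repairs "sliced" means, here is a minimal Python sketch that cuts the full token ring into equal sub-ranges and prints one subrange repair command per slice. It assumes the default Murmur3Partitioner token bounds and uses a placeholder keyspace name; the helper names are made up for illustration. It is not how reaper itself works, which also handles scheduling, retries and per-node ranges.

```python
# Assumption: the cluster uses Murmur3Partitioner, whose tokens span -2^63 .. 2^63-1.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_range(start, end, slices):
    """Split the range (start, end] into `slices` contiguous, near-equal sub-ranges."""
    if end <= start:
        raise ValueError("end must be greater than start")
    width = end - start
    if not 1 <= slices <= width:
        raise ValueError("slices must be between 1 and the range width")
    # Integer arithmetic keeps the boundaries exact even for 64-bit tokens.
    bounds = [start + (width * i) // slices for i in range(slices + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, slices):
    """Build one `nodetool repair -st <start> -et <end>` command per slice.

    `keyspace` is a placeholder; the commands are only printed, never executed.
    """
    return [
        f"nodetool repair -st {st} -et {et} {keyspace}"
        for st, et in split_token_range(MIN_TOKEN, MAX_TOKEN, slices)
    ]


if __name__ == "__main__":
    for cmd in repair_commands("xxx", 4):
        print(cmd)
```

Smaller slices mean that a timeout or failure only costs you one slice instead of the whole repair run.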

The time repairs actually take may vary a lot depending on how much data has to be streamed
and how inconsistent your cluster is.

50 mbit/s is really a bit low! The actual performance depends on many factors, like your
CPU, RAM, HDD/SSD, concurrency settings, and the load on the "old nodes" of the cluster.
This is quite specific to your setup, so you have to track it down individually.

2017-03-17 22:07 GMT+01:00 Roland Otta <<>>:


we are quite inexperienced with cassandra at the moment and are playing
around with a new cluster we built up for getting familiar with
cassandra and its possibilities.

while getting familiar with that topic we noticed that repairs in
our cluster take a long time. To get an idea of our current setup here
are some numbers:

our cluster currently consists of 4 nodes (replication factor 3).
these nodes are all on dedicated physical hardware in our own
datacenter. all of the nodes have

32 cores @ 2.9 GHz
64 GB RAM
2 SSDs (RAID 0), 900 GB each, for data
1 separate HDD for OS + commitlogs

current dataset:
approx 530 GB per node
21 tables (biggest one has more than 200 GB / node)

i already tried setting compactionthroughput + streamingthroughput to
unlimited for testing purposes ... but that did not change anything.

when checking system resources i cannot see any bottleneck (cpus are
pretty idle and we have no iowaits).

when issuing a repair via

nodetool repair -local

on a node the repair takes longer than a day.
is this normal or could we normally expect a faster repair?

i also noticed that initializing new nodes in the datacenter was
really slow (approx 50 mbit/s). here, too, i expected much better
performance - could those 2 problems be somehow related?
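
To put the 50 mbit/s figure in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes a new node has to receive about the same 530 GB that the existing nodes hold, uses decimal units (1 GB = 10^9 bytes, 1 mbit = 10^6 bits), and ignores compression and protocol overhead, so treat the result as an order of magnitude only.

```python
def transfer_hours(data_gb, rate_mbit_per_s):
    """Hours needed to move `data_gb` gigabytes at a sustained `rate_mbit_per_s`."""
    bits = data_gb * 1e9 * 8                     # GB -> bytes -> bits (decimal units)
    seconds = bits / (rate_mbit_per_s * 1e6)     # mbit/s -> bit/s
    return seconds / 3600


if __name__ == "__main__":
    # Assumed: ~530 GB to stream; 50 mbit/s is the observed rate from the mail,
    # 1000 mbit/s is only a hypothetical comparison point.
    for rate in (50, 1000):
        print(f"{rate} mbit/s: {transfer_hours(530, rate):.1f} hours")
```

At the observed rate this works out to roughly a day per node, which is in the same range as the repair duration mentioned above.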

