Read Repair is a process that runs during a read to repair differences in the background. Itís active on (by default) 10% of the reads.
We have seen read repair take very long time even for few GBs
I assume you mean nodetool repair (aka anti entropy). It runs in two phases, first it calculates a hash of the data on the node and second it transfers data to resolve inconsistencies.
You can track the first part using nodetool compactionstats and the second with nodetool netstats.
I would guess itís the first part that is taking a while, how much CPU power do you have ? Also the first part if throttled by the compaction_throughput YAML setting.
Co-Founder & Principal Consultant
Apache Cassandra Consulting
We have seen read repair take very long time even for few GBs of data even though we don't see disk or network bottlenecks. Do you use any specific configuration to speed up read repairs?