cassandra-user mailing list archives

From Philippe
Subject Re: nodetool repair caused high disk space usage
Date Sun, 21 Aug 2011 06:25:34 GMT
> Do you have an indication that at least the disk space is in fact
> consistent with the amount of data being streamed between the nodes? I
> think you had 90 -> ~ 450 gig with RF=3, right? Still sounds like a
> lot assuming repairs are not running concurrently (and compactions are
> able to run after a repair before the next repair of a neighbor
> starts).
Hi Peter,
When a repair was running on the 40GB keyspace, I'd usually see range
repairs for up to a couple thousand ranges per CF. If the number of ranges
equals the number of keys, that's a very small amount of data being moved
around.
However, at the time I hadn't noticed that multiple repairs were running
concurrently on the same nodes and on their neighbors, so I suppose my
experience isn't valid evidence of a bug. Still, I suspect it will help
someone along the way, because they'll have multiple repairs going on too,
and I now have a much better understanding of what's going on myself.

I've now reloaded all the data into my cluster. The load is 140GB on each
node, and I've been able to run a repair on each CF that comes out almost
100% consistent. I'm now restarting the daily repair crons to see whether
they go out of whack or not.
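
As an illustration of keeping daily repairs from overlapping on a node and
its neighbors, here is a minimal sketch that gives each node its own start
hour and prints one crontab line per node. The host names, the keyspace
name, and the slot length are assumptions made up for the example, not the
setup described above, and the sketch only helps if each repair actually
finishes within its slot.

```python
def staggered_schedule(nodes, slot_hours):
    """Map each node to a distinct daily start hour, one slot per node."""
    if slot_hours <= 0:
        raise ValueError("slot_hours must be positive")
    if len(nodes) * slot_hours > 24:
        raise ValueError("slots do not fit in one day")
    # Node i starts at hour i * slot_hours, so the slots never overlap.
    return {node: i * slot_hours for i, node in enumerate(nodes)}


def cron_lines(schedule, keyspace):
    """Render one crontab entry per node, running at minute 0 of its hour."""
    return [
        f"0 {hour} * * * nodetool -h {node} repair {keyspace}"
        for node, hour in schedule.items()
    ]


if __name__ == "__main__":
    # Hypothetical three-node ring and keyspace name.
    schedule = staggered_schedule(["node1", "node2", "node3"], slot_hours=8)
    for line in cron_lines(schedule, "my_keyspace"):
        print(line)
```

Serializing the repairs completely is the bluntest option. It trades a
longer overall repair cycle for the guarantee that compaction on one node
can catch up before a neighbor starts streaming to it.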
