cassandra-user mailing list archives

From Maxim Potekhin
Subject Re: Repair taking a long, long time
Date Wed, 20 Jul 2011 13:58:58 GMT
I can re-load all the data that I have in the cluster, from a flat-file
cache I have on NFS, many times faster than nodetool repair takes. And
that's not even an accurate comparison, because as others noted nodetool
repair eats up disk space for breakfast and takes more than 24 hrs on a
200GB data load, at which point I have to cancel. That's not acceptable.
I simply don't know what to do now.

On 7/20/2011 8:47 AM, David Boxenhorn wrote:
> I have this problem too, and I don't understand why.
> I can repair my nodes very quickly by looping through all my data (when 
> you read your data it does read-repair), but nodetool repair takes 
> forever. I understand that nodetool repair builds merkle trees, etc. 
> etc., so it's a different algorithm, but why can't nodetool repair be 
> smart enough to choose the best algorithm? Also, I don't understand 
> what's special about my data that makes nodetool repair so much slower 
> than looping through all my data.
> On Wed, Jul 20, 2011 at 12:18 AM, Maxim Potekhin wrote:
>     Thanks Edward. I'm told by our IT that the switch connecting the
>     nodes is pretty fast.
>     Seriously, in my house I copy complete DVD images from my bedroom to
>     the living room downstairs via WiFi, and a dozen GB does not
>     seem like a problem, on dirt cheap hardware (Patriot Box Office).
>     I also have just _one_ major column family but caveat emptor -- 8
>     indexes attached to it (and there will be more). There is one
>     accounting CF which is small, so it can't possibly make a difference.
>     By contrast, compaction (as in nodetool) performs quite well on
>     this cluster. I'm starting to suspect some sort of malfunction.
>     Looked at the system log during the "repair", there is some
>     compaction agent doing
>     work that I'm not sure makes sense (and I didn't call for it).
>     Disk utilization all of a sudden goes up to 40%
>     per Ganglia, and stays there, this is pretty silly considering the
>     cluster is IDLE and we have SSDs. No external writes,
>     no reads. There are occasional GC stoppages, but these I can live
>     with.
>     This is the 2nd time in a row this repair debacle has happened.
>     Cr@p. I need to go to production soon and that doesn't look good
>     at all. If I can't manage a system that simple (and/or get help
>     on this list) I may have to cut losses, i.e. stay with Oracle.
>     Regards,
>     Maxim
>     On 7/19/2011 12:16 PM, Edward Capriolo wrote:
>         Well, most SSDs are pretty fast. There is one more thing to
>         consider. If Cassandra determines nodes are out of sync it has
>         to transfer data across the network. If that is the case you
>         have to look at 'nodetool streams' and determine how much data
>         is being transferred between nodes. There are some open
>         tickets where, with larger tables, repair is streaming more than
>         it needs to. But even if the transfers are only 10% of your
>         200GB, transferring 20 GB is not trivial.
>         If you have multiple keyspaces and column families, repairing
>         one at a time might make the process more manageable.
