incubator-cassandra-user mailing list archives

From David Boxenhorn <>
Subject Re: Repair taking a long, long time
Date Wed, 20 Jul 2011 14:31:51 GMT
As I indicated below (but didn't say explicitly), another option is to set
the read repair chance to 1.0 for all your CFs and loop over all your data,
since each read triggers a read repair.
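
A minimal sketch of such a loop, in Python, assuming the read repair chance
is already set to 1.0 on the column family. The fetch_page(start_key, count)
callable is hypothetical: it stands in for whatever range read your client
offers (for example a wrapper around the Thrift get_range_slices call) and
is assumed to return up to `count` (key, columns) pairs in the cluster's key
order, with start_key inclusive. The paging logic is the only real content
here; nothing in it is specific to a particular client library.

```python
def touch_all_rows(fetch_page, page_size=100):
    """Read every row once, page by page, and return how many rows were read.

    fetch_page(start_key, count) is a hypothetical callable (an assumption of
    this sketch): it returns up to `count` (key, columns) pairs in key order,
    starting at start_key inclusive; an empty start_key means "from the start".
    """
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")

    rows_read = 0
    start_key = ""
    first_page = True
    while True:
        page = list(fetch_page(start_key, page_size))
        # Every page after the first begins with the last row of the
        # previous page, so skip that overlapping row when counting.
        new_rows = page if first_page else page[1:]
        rows_read += len(new_rows)
        if len(page) < page_size:
            # A short page means the end of the key range was reached.
            return rows_read
        start_key = page[-1][0]
        first_page = False
```

The rows themselves are discarded; the point is only that each one gets
read, so a small page size keeps memory use low on the client.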

On Wed, Jul 20, 2011 at 4:58 PM, Maxim Potekhin <> wrote:

> I can re-load all the data that I have in the cluster, from a flat-file
> cache I have on NFS, many times faster than nodetool repair takes. And
> that's not even accurate because, as others noted, nodetool repair eats up
> disk space for breakfast and takes more than 24hrs on a 200GB data load,
> at which point I have to cancel. That's not acceptable. I simply don't
> know what to do now.
> On 7/20/2011 8:47 AM, David Boxenhorn wrote:
> I have this problem too, and I don't understand why.
> I can repair my nodes very quickly by looping through all my data (when
> you read your data it does read-repair), but nodetool repair takes
> forever. I understand that nodetool repair builds Merkle trees, etc., so
> it's a different algorithm, but why can't nodetool repair be smart enough
> to choose the best algorithm? Also, I don't understand what's special
> about my data that makes nodetool repair so much slower than looping
> through all my data.
> On Wed, Jul 20, 2011 at 12:18 AM, Maxim Potekhin <> wrote:
>> Thanks Edward. I'm told by our IT that the switch connecting the nodes
>> is pretty fast.
>> Seriously, in my house I copy complete DVD images from my bedroom to
>> the living room downstairs via WiFi, and a dozen GB does not seem like
>> a problem, on dirt cheap hardware (Patriot Box Office).
>> I also have just _one_ major column family but, caveat emptor, 8 indexes
>> attached to it (and there will be more). There is one accounting CF,
>> which is small and can't possibly make a difference.
>> By contrast, compaction (as in nodetool) performs quite well on this
>> cluster. I'm starting to suspect some sort of malfunction.
>> I looked at the system log during the "repair": there is some compaction
>> agent doing work that I'm not sure makes sense (and I didn't call for
>> it). Disk utilization all of a sudden goes up to 40% per Ganglia and
>> stays there, which is pretty silly considering the cluster is IDLE and
>> we have SSDs. No external writes, no reads. There are occasional GC
>> stoppages, but these I can live with.
>> This is the 2nd time in a row that this repair debacle has happened.
>> Cr@p. I need to go to production soon and that doesn't look good at
>> all. If I can't manage a system that simple (and/or get help on this
>> list) I may have to cut losses, i.e. stay with Oracle.
>> Regards,
>> Maxim
>> On 7/19/2011 12:16 PM, Edward Capriolo wrote:
>>> Well, most SSDs are pretty fast. There is one more thing to consider. If
>>> Cassandra determines nodes are out of sync it has to transfer data across
>>> the network. If that is the case you have to look at 'nodetool streams'
>>> and determine how much data is being transferred between nodes. There are
>>> some open tickets where, with larger tables, repair is streaming more
>>> than it needs to. But even if the transfers are only 10% of your 200GB,
>>> transferring 20GB is not trivial.
>>> If you have multiple keyspaces and column families, repairing them one
>>> at a time might make the process more manageable.
