incubator-cassandra-user mailing list archives

From Héctor Izquierdo Seliva <izquie...@strands.com>
Subject Re: Corrupted data
Date Sun, 10 Jul 2011 06:53:16 GMT
All the important stuff is using QUORUM. Normal operation uses around
3-4 GB of heap out of 6. I've also tried running repair on a per CF
basis, and still no luck. I've found it's faster to bootstrap a node
again than to repair it.
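
As a side note on the QUORUM point above, the arithmetic can be sketched as follows. This is a minimal illustration, not Cassandra code: the function names are made up for this example, and it assumes the usual definition of QUORUM as floor(RF / 2) + 1 replicas, where RF is the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def read_overlaps_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True when any read set must share a replica with any write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


if __name__ == "__main__":
    # Hypothetical replication factors, chosen only for illustration.
    for rf in (1, 2, 3, 5):
        q = quorum(rf)
        print(f"RF={rf} quorum={q} overlap={read_overlaps_write(rf, q, q)}")
```

When both reads and writes use QUORUM, the read set and the write set always share at least one replica, which is why rows skipped on one node can reasonably be expected to exist on another.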

Once I have the cluster in a sane state I'll try running a repair as
part of normal operation and see if it manages to finish.
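
A rough sketch of what a per-CF repair pass could look like when scripted. The host, keyspace, and column family names are placeholders invented for this example, and it assumes the `nodetool -h <host> repair <keyspace> <cf>` argument form is accepted by the installed version. Repairing one column family at a time keeps each run small and stops at the first failure.

```python
import shlex
import subprocess
from typing import List


def repair_commands(host: str, keyspace: str, column_families: List[str]) -> List[List[str]]:
    """Build one nodetool repair invocation per column family."""
    return [["nodetool", "-h", host, "repair", keyspace, cf] for cf in column_families]


def run_sequentially(commands: List[List[str]]) -> None:
    """Run the commands one at a time; check=True raises on the first failure."""
    for cmd in commands:
        print("running:", " ".join(shlex.quote(arg) for arg in cmd))
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder values (assumptions), not taken from the cluster in this thread.
    cmds = repair_commands("10.0.0.1", "MyKeyspace", ["Users", "Events"])
    for cmd in cmds:
        print(" ".join(shlex.quote(arg) for arg in cmd))
    # run_sequentially(cmds)  # uncomment on a machine where nodetool is installed
```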

Btw, we are not using super columns.

Thanks for the tips

On Sat, 09-07-2011 at 17:57 -0700, aaron morton wrote:
> > Nop, only when something breaks
> Unless you've been working at QUORUM life is about to get trickier. Repair is an essential
> part of running a Cassandra cluster; without it you risk data loss and dead data coming back
> to life.
> 
> If you have been writing at QUORUM, and so have a reasonable expectation of data replication,
> the normal approach is to happily let scrub skip the rows. After scrub has completed, a repair
> will see the data repaired using one of the other replicas. That's probably already happened,
> as the scrub process skipped the rows when writing them out to the new files.
> 
> Try to run repair. Try running it on a single CF to start with.
> 
> 
> Good luck
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 9 Jul 2011, at 16:45, Héctor Izquierdo Seliva wrote:
> 
> > Hi Peter.
> > 
> > I have a problem with repair: it always brings the node doing the
> > repair down. I've tried setting index_interval to 5000, and
> > it still dies with OutOfMemory errors, or even worse, it generates
> > thousands of tiny sstables before dying.
> > 
> > I've tried like 20 repairs during this week. None of them finished. This
> > is on a 16GB machine using 12GB heap so it doesn't crash (too early).
> > 
> > 
> > On Sat, 09-07-2011 at 16:16 +0200, Peter Schuller wrote:
> >>>> - Have you been running repair consistently ?
> >>> 
> >>> Nop, only when something breaks
> >> 
> >> This is unrelated to the problem you were asking about, but if you
> >> never run delete, make sure you are aware of:
> >> 
> >> http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair
> >> http://wiki.apache.org/cassandra/DistributedDeletes
> >> 
> >> 
> > 
> > 
> 


