incubator-cassandra-user mailing list archives

From Peter Schuller <peter.schul...@infidyne.com>
Subject Re: How to determine if repair need to be run
Date Thu, 31 Mar 2011 22:11:54 GMT
> Thanks a lot for elaborating on repairs. Still, it's a bit fuzzy to me why
> it is so important to run a repair before the GCGraceSeconds kicks in. Does
> this mean a delete does not get "replicated"? In other words, when I delete
> something on a node, doesn't Cassandra set tombstones on its replica copies?

Deletes are replicated, but they are special: unlike ordinary data,
you want to *remove* something, yet the marker that says "this is
gone" (the tombstone) is itself data that has to be stored. Clearly you
don't want to keep track of everything ever removed in the cluster
forever, so tombstones have to expire somehow; that is what
GCGraceSeconds controls. For that reason, there is a requirement that
tombstones are replicated to all replicas prior to their expiry. If a
replica missed the delete and the tombstone is purged elsewhere, nothing
is left to say the data is gone, and the deleted data can reappear.
See:

      http://wiki.apache.org/cassandra/DistributedDeletes
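
To make the failure mode concrete, here is a toy simulation. It is only a
sketch and not Cassandra code: the two dict "replicas", the integer clock,
and the helper names (write, compact, repair, scenario) are all made up for
illustration. It assumes last-write-wins by timestamp and a repair that
simply exchanges the newest version of every cell.

```python
# Toy model, NOT Cassandra code. Each replica maps key -> (timestamp, value);
# a value of None stands for a tombstone. All names here are hypothetical.
TOMBSTONE = None


def write(replica, key, ts, value):
    """Apply a write only if it is newer than what the replica holds."""
    current = replica.get(key)
    if current is None or ts > current[0]:
        replica[key] = (ts, value)


def compact(replica, now, gc_grace):
    """Purge tombstones older than gc_grace, as compaction eventually would."""
    expired = [k for k, (ts, v) in replica.items()
               if v is TOMBSTONE and now - ts > gc_grace]
    for key in expired:
        del replica[key]


def repair(a, b):
    """Exchange cells in both directions so each side has the newest version."""
    for src, dst in ((a, b), (b, a)):
        for key, (ts, value) in list(src.items()):
            write(dst, key, ts, value)


def scenario(repair_before_expiry):
    gc_grace = 10
    a, b = {}, {}
    write(a, "k", 1, "v")
    write(b, "k", 1, "v")
    write(a, "k", 2, TOMBSTONE)   # the delete reaches a, but b misses it
    if repair_before_expiry:
        repair(a, b)              # tombstone reaches b while it still exists
    compact(a, 20, gc_grace)      # time 20: the tombstone is past gc_grace
    compact(b, 20, gc_grace)
    repair(a, b)                  # a late repair
    return a.get("k"), b.get("k")


print(scenario(True))    # key is gone on both replicas
print(scenario(False))   # the old value is copied back to a
```

In the second case the late repair sees live data on b and nothing at all on
a, so it has no way to tell a delete ever happened and copies the old value
back.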

> And technically, isn't repair only needed for cases where things weren't
> properly propagated in the cluster? If all writes are written to the right
> replicas, and all deletes are written to all the replicas, and all nodes
> were available at all times, then everything should work as designed -
> without manual intervention, right?

Yes, but you can assume that doesn't happen in real life for extended
periods of time. It takes very little for a *few* writes to fail to
replicate (for example, just restarting a Cassandra node will cause
some writes to be dropped - hinted handoff is not a guarantee, only an
optimization).
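
Since you have to assume some writes were missed, the practical rule is to
complete a repair on every node at least once per GCGraceSeconds. A rough
sketch of that check follows. The function names and the one-day safety
margin are my own assumptions, and you would have to record the last
successful repair time yourself; 864000 seconds (10 days) is the default
GCGraceSeconds.

```python
from datetime import datetime, timedelta

# Default GCGraceSeconds (10 days); use your column family's actual setting.
DEFAULT_GC_GRACE_SECONDS = 864000


def repair_deadline(last_repair, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Latest time by which the next repair should have completed."""
    return last_repair + timedelta(seconds=gc_grace_seconds)


def repair_overdue(last_repair, now,
                   gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS,
                   safety_margin=timedelta(days=1)):
    """True once 'now' is within safety_margin of the deadline, or past it.

    safety_margin is a hypothetical buffer, since a repair takes time to run.
    """
    return now >= repair_deadline(last_repair, gc_grace_seconds) - safety_margin


# Hypothetical timestamp of the last successful repair on this node.
last = datetime(2011, 3, 1)
print(repair_deadline(last))
print(repair_overdue(last, datetime(2011, 3, 5)))
print(repair_overdue(last, datetime(2011, 3, 10)))
```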

-- 
/ Peter Schuller
