cassandra-user mailing list archives

From aaron morton <>
Subject Re: eliminate need to repair by using column TTL??
Date Fri, 22 Jul 2011 09:43:54 GMT
Read repair only repairs data that is actually read, on the nodes that are up at that time, and
it does not guarantee that the changes it detects will be written back to those nodes. The diff
mutations are async fire-and-forget messages, which may go missing or be dropped or ignored by
the recipient just like any other message.

Also, getting hit with a bunch of read repair operations is pretty painful. The normal read
runs, the coordinator detects the digest mismatch, the read runs again on all nodes and
they all have to return their full data (no digests this time), the coordinator detects the
diffs, and mutations are sent back to each node that needs them. All of this happens
synchronously with the read request when the CL > ONE. That's 2 reads with more network IO,
and up to RF mutations.
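
To make the cost concrete, here is a rough Python sketch of the sequence described above. It is a toy model, not Cassandra code: the names read_with_repair and digest, and the use of plain dicts as replicas, are made up for illustration. It also applies the repair mutations directly, whereas the real ones are async messages that can be dropped.

```python
import hashlib

def digest(cell):
    """Hash of a (value, timestamp) cell, standing in for a digest response."""
    return hashlib.md5(repr(cell).encode()).hexdigest()

def read_with_repair(replicas, key):
    """Toy coordinator read (hypothetical helper, not a Cassandra API).

    `replicas` is a list of dicts mapping key -> (value, timestamp).
    """
    stats = {"read_rounds": 1, "repair_mutations": 0}
    # Round 1: full data from one replica, digests from the rest.
    data = replicas[0].get(key)
    if all(digest(r.get(key)) == digest(data) for r in replicas[1:]):
        return data, stats
    # Digest mismatch: round 2 asks every replica for its full data.
    stats["read_rounds"] = 2
    cells = [r.get(key) for r in replicas]
    newest = max((c for c in cells if c is not None), key=lambda c: c[1])
    # Send the winning cell to each replica that is missing it or behind.
    for replica, cell in zip(replicas, cells):
        if cell != newest:
            replica[key] = newest
            stats["repair_mutations"] += 1
    return newest, stats
```

With three replicas where one holds an old value and one holds nothing, the first call reports two read rounds and two repair mutations. A second call on the same key finds matching digests and finishes in one round.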

The delete thing is important but repair also reduces the chance of reads getting hit with
RR and gives me confidence when it's necessary to nuke a bad node. 

Your plan may work but it feels risky to me. You may end up with worse read performance and
unpleasant emotions if you ever have to nuke a node. Others may disagree. 

I'm not ignoring the fact that repair can take a long time, fail, hurt performance etc. There
are plans to improve it though. 

Aaron Morton
Freelance Cassandra Developer

On 22 Jul 2011, at 19:55, wrote:

> One of the main reasons for regularly running repair is to make sure deletes are propagated
> in the cluster, i.e., data is not resurrected if a node never received the delete call.
> And repair-on-read takes care of repairing inconsistencies "on-the-fly".
> So if I were to set a universal TTL on all columns - so everything would only live for
> a certain age, would I be able to get away without having to do regular repairs with nodetool?
> I realize this scenario would not be applicable for everyone, but our data model would
> allow us to do this. 
> So could this be an alternative to running the (resource-intensive, long-running) repairs
> with nodetool?
> Thanks.
