cassandra-user mailing list archives

From Paolo Crosato <paolo.cros...@targaubiest.com>
Subject Best practices for repair
Date Thu, 19 Jun 2014 14:13:00 GMT
Hi everybody,

we have some problems running repairs on a timely schedule. We have a
three node deployment, and we start repair on one node every week,
repairing one column family at a time.
However, when we reach the big column families, the repair sessions
usually hang indefinitely, and we have to restart them manually.

The script runs commands like:

nodetool repair keyspace columnfamily

one by one.

This has not been a major issue for some time, since we never delete
data; however, we would like to sort out the issue once and for all.
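
A minimal sketch of such a script, in Python for illustration. The
function names, the timeout and the injectable runner are assumptions,
not the actual script. The timeout only stops a nodetool client that
never returns; as far as I know it does not cancel the repair session
on the server side.

```python
import subprocess


def build_repair_command(keyspace, column_family, primary_range=False):
    """Return the nodetool argv for repairing a single column family."""
    cmd = ["nodetool", "repair"]
    if primary_range:
        cmd.append("-pr")  # repair only the node's primary range
    cmd += [keyspace, column_family]
    return cmd


def run_repairs(keyspace, column_families, timeout_seconds,
                primary_range=False, runner=subprocess.run):
    """Repair column families one at a time.

    Returns the names of the column families whose repair exited with a
    non-zero status or did not finish within timeout_seconds.
    """
    failed = []
    for column_family in column_families:
        cmd = build_repair_command(keyspace, column_family, primary_range)
        try:
            result = runner(cmd, timeout=timeout_seconds)
            if result.returncode != 0:
                failed.append(column_family)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the nodetool client when the timeout expires
            failed.append(column_family)
    return failed
```

Calling run_repairs with the keyspace name, the list of column families
and a timeout in seconds returns the column families that need another
attempt, so a hung session is reported instead of blocking the rest of
the run.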

Reading resources on the net, I came to the conclusion that we could:

1) either run a repair session like the one above, but with the -pr
switch, and run it on every node, not just on one
2) or run sub range repair as described here
http://www.datastax.com/dev/blog/advanced-repair-techniques , which
would be the best option.
However, the latter procedure would require us to write a Java program
that calls describe_splits to get the tokens to feed nodetool repair with.
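
A minimal sketch of option 2, in Python rather than Java, for
illustration. It splits a caller-supplied token range into equal-width
subranges and builds one nodetool repair command per subrange with the
-st/-et (start/end token) options. This is a stand-in under stated
assumptions: an even split ignores how the data is actually distributed,
which is what describe_splits accounts for, wrap-around ranges are not
handled, and the function names are hypothetical.

```python
def split_token_range(start, end, parts):
    """Split the range (start, end] into `parts` contiguous subranges.

    Assumes start < end, i.e. the range does not wrap around the ring.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if end <= start:
        raise ValueError("end must be greater than start")
    width = end - start
    # Integer arithmetic keeps the boundaries exact even for 64-bit
    # or 127-bit tokens.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def subrange_repair_commands(keyspace, column_family, start, end, parts):
    """Return one nodetool argv per subrange of (start, end]."""
    return [
        ["nodetool", "repair", "-st", str(sub_start), "-et", str(sub_end),
         keyspace, column_family]
        for sub_start, sub_end in split_token_range(start, end, parts)
    ]
```

The start and end tokens would come from the range a node owns (for
example as listed by nodetool ring), and each returned command repairs
one slice of it, so a failed slice can be retried without redoing the
whole range.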

Is it true that the second procedure is available out of the box only
in the commercial version of OpsCenter?

I would like to know if these are the current best practices for repairs
or if there is some other option that makes repair easier to perform
and more reliable than it is now.

Regards,

Paolo Crosato

-- 
Paolo Crosato
Software engineer/Custom Solutions
e-mail: paolo.crosato@targaubiest.com

