cassandra-commits mailing list archives

From Apache Wiki <>
Subject [Cassandra Wiki] Update of "Operations" by PeterSchuller
Date Mon, 10 Jan 2011 22:09:17 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by PeterSchuller.
The comment on this change is: Add a section on periodic nodetool repair and why it is important
w.r.t. GCGraceSeconds.


  Like all nodetool operations, repair is non-blocking; it sends the command to the given
node, but does not wait for the repair to actually finish.  You can tell that repair is finished
when (a) there are no active or pending tasks in the CompactionManager, and after that when
(b) there are no active or pending tasks on o.a.c.concurrent.AE-SERVICE-STAGE, or o.a.c.service.StreamingService.
  Repair should be run against one machine at a time.  (This limitation will be fixed in 0.7.)
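
The two-step completion check described above can be sketched as a small state machine. This is only an illustration: the `Sample` structure and the way the counters are gathered (for example via JMX) are assumptions, not part of nodetool, and each field is taken to hold the sum of active and pending tasks for that component.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Sample:
    """One poll of a node's task counters (active + pending). Hypothetical structure."""
    compaction: int   # CompactionManager
    ae_service: int   # o.a.c.concurrent.AE-SERVICE-STAGE
    streaming: int    # o.a.c.service.StreamingService


def repair_finished_at(samples: Iterable[Sample]) -> Optional[int]:
    """Return the index of the first sample at which repair looks finished.

    Step (a): wait until the compaction counter reaches zero.
    Step (b): on a later sample, require the AE and streaming counters to be zero.
    Returns None if the samples never satisfy both steps.
    """
    compaction_drained = False
    for i, s in enumerate(samples):
        if not compaction_drained:
            # Still in step (a); move to step (b) starting with the next sample.
            compaction_drained = (s.compaction == 0)
            continue
        if s.ae_service == 0 and s.streaming == 0:
            return i
    return None


if __name__ == "__main__":
    polls = [Sample(2, 1, 0), Sample(0, 1, 1), Sample(0, 0, 1), Sample(0, 0, 0)]
    print(repair_finished_at(polls))
```

Note that an idle node that has not started repairing yet also shows all-zero counters, so a check like this is only meaningful once polling begins after the repair has visibly started.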
+ === Frequency of nodetool repair ===
+ Unless your application never performs deletes, it is vital that production clusters run `nodetool
repair` periodically on every node in the cluster. The hard upper bound on the repair interval
is the value used for GCGraceSeconds (see [[DistributedDeletes]]): every node must complete
a repair at least once within any window of GCGraceSeconds. Repairing that often ensures
that deletes are not "forgotten" in the cluster.
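
As a rough way to sanity-check a repair schedule against this requirement, the sketch below tests two conditions: that repairing the nodes one at a time fits inside the chosen interval, and that the worst-case gap between two completed repairs on the same node stays within GCGraceSeconds. The function name, its parameters, and the conservative "interval plus duration" bound are illustrative assumptions; the 10-day default is the one quoted on this page.

```python
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 3600  # 10 days, the default cited on this page


def schedule_is_safe(node_count: int,
                     repair_duration_s: int,
                     interval_s: int,
                     gc_grace_s: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """Check a hypothetical periodic repair schedule.

    node_count:        nodes in the cluster, repaired one at a time
    repair_duration_s: assumed upper bound on one node's repair time
    interval_s:        time between successive repair starts on the same node
    """
    # All nodes must be repaired serially within one interval.
    fits_serially = node_count * repair_duration_s <= interval_s
    # Conservative bound: one repair finishes instantly, the next takes the
    # full duration, so completions can be interval + duration apart.
    worst_gap_s = interval_s + repair_duration_s
    return fits_serially and worst_gap_s <= gc_grace_s


if __name__ == "__main__":
    day, hour = 24 * 3600, 3600
    print(schedule_is_safe(6, 4 * hour, 7 * day))    # weekly repairs, 6 nodes
    print(schedule_is_safe(6, 4 * hour, 10 * day))   # interval equal to GCGraceSeconds
```

Under these assumptions, an interval equal to GCGraceSeconds is already too long, because the time a repair takes to complete eats into the window.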
  === Handling failure ===
  If a node goes down and comes back up, the ordinary repair mechanisms will be adequate to
deal with any inconsistent data.  Remember, though, that if a node misses updates and is not
repaired for longer than your configured GCGraceSeconds (default: 10 days), it could have
missed remove operations permanently.  In that case, unless your application performs no removes,
you should wipe its data directory, re-bootstrap it, and run removetoken on its old entry in
the ring (see below).
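
The decision rule in this paragraph can be written down as a tiny helper. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and how you measure the time a node has gone unrepaired is left to the operator.

```python
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 3600  # 10 days, the default cited above


def should_wipe_and_rebootstrap(unrepaired_seconds: int,
                                app_performs_removes: bool,
                                gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """Return True when a returning node should be wiped and re-bootstrapped.

    unrepaired_seconds: how long the node has missed updates without a repair
    """
    if not app_performs_removes:
        # With no removes there are no tombstones to miss; ordinary repair suffices.
        return False
    # Past GCGraceSeconds, missed removes may be permanently lost on this node.
    return unrepaired_seconds > gc_grace_seconds


if __name__ == "__main__":
    day = 24 * 3600
    print(should_wipe_and_rebootstrap(11 * day, app_performs_removes=True))
    print(should_wipe_and_rebootstrap(3 * day, app_performs_removes=True))
```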
