cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "Operations" by JonathanEllis
Date Sun, 06 Mar 2011 00:50:39 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by JonathanEllis.
The comment on this change is: update repair section for 0.7.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=82&rev2=83

--------------------------------------------------

  Cassandra repairs data in two ways:
  
   1. Read Repair: every time a read is performed, Cassandra compares the versions at each
replica (in the background, if a low consistency was requested by the reader to minimize latency),
and the newest version is sent to any out-of-date replicas.
-  1. Anti-Entropy: when `nodetool repair` is run, Cassandra computes a Merkle tree of the
data on that node, and compares it with the versions on other replicas, to catch any out of
sync data that hasn't been read recently.  This is intended to be run infrequently (e.g.,
weekly) since computing the Merkle tree is relatively expensive in disk i/o and CPU, since
it scans ALL the data on the machine (but it is very network efficient).  
+  1. Anti-Entropy: when `nodetool repair` is run, Cassandra computes a Merkle tree for each
range of data on that node, and compares it with the versions on other replicas, to catch
any out-of-sync data that hasn't been read recently.  This is intended to be run infrequently
(e.g., weekly) because computing the Merkle tree is relatively expensive in disk I/O and CPU:
it scans ALL the data on the machine (though it is very network efficient).
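
To illustrate why a Merkle tree comparison is network efficient, here is a minimal sketch, not Cassandra's actual implementation. It assumes each replica's data for a range has already been split into the same power-of-two number of buckets; the names `merkle_levels` and `differing_buckets` are hypothetical. Only hashes are compared, and the search descends only into subtrees whose hashes differ.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(buckets):
    """Return the tree levels, leaves first. Assumes len(buckets) is a power of two."""
    level = [_h(b) for b in buckets]
    levels = [level]
    while len(level) > 1:
        # Each parent hash covers the concatenation of its two children.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_buckets(a, b):
    """Indices of leaf buckets that differ between two replicas with equal bucket counts."""
    la, lb = merkle_levels(a), merkle_levels(b)
    suspects = [0]  # start at the root
    for depth in range(len(la) - 1, -1, -1):
        # Keep only nodes whose hashes disagree at this level.
        suspects = [i for i in suspects if la[depth][i] != lb[depth][i]]
        if depth > 0:
            # Descend into the children of mismatched nodes only.
            suspects = [c for i in suspects for c in (2 * i, 2 * i + 1)]
    return suspects
```

If the root hashes match, nothing further is compared; otherwise only the mismatched buckets would need to be exchanged between replicas.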
  
  Running `nodetool repair`:
- Like all nodetool operations, repair is non-blocking; it sends the command to the given
node, but does not wait for the repair to actually finish.  You can tell that repair is finished
when (a) there are no active or pending tasks in the CompactionManager, and after that when
(b) there are no active or pending tasks on o.a.c.concurrent.AE-SERVICE-STAGE, or o.a.c.service.StreamingService.
+ Like all nodetool operations in 0.7, repair is blocking: it will wait for the repair to
finish and then exit.  This may take a long time on large data sets.
  
- Repair should be run against one machine at a time.  (This limitation will be fixed in 0.7.)
+ It is safe to run repair against multiple machines at the same time, but to minimize the
impact on your application workload it is recommended to wait for it to complete on one node
before invoking it against the next.
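
Because repair blocks until it finishes in 0.7, the one-node-at-a-time recommendation can be scripted as a simple loop. The following is a minimal sketch under stated assumptions: `nodetool` is on the PATH, the host names are placeholders, and a non-zero exit status is taken to mean failure (the wiki page does not specify this). The function name `repair_sequentially` is hypothetical.

```python
import subprocess


def repair_sequentially(hosts, run=subprocess.run):
    """Run `nodetool -h <host> repair` on each host in turn, stopping at the first failure."""
    completed = []
    for host in hosts:
        # The call returns only when nodetool exits, i.e. when repair on this node is done.
        result = run(["nodetool", "-h", host, "repair"])
        if result.returncode != 0:  # assumption: non-zero exit status signals failure
            raise RuntimeError(f"repair failed on {host}")
        completed.append(host)
    return completed
```

For example, `repair_sequentially(["node1.example.com", "node2.example.com"])` would not start repair on the second (hypothetical) host until the first has completed. The `run` parameter exists only so the command runner can be replaced in tests.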
  
  === Frequency of nodetool repair ===
  
