cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "Operations" by PeterSchuller
Date Mon, 10 Jan 2011 22:24:54 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by PeterSchuller.
The comment on this change is: Fix wiki syntax for numbered list.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=77&rev2=78

--------------------------------------------------

  
  There are at least three ways to deal with this scenario.
  
- #1 Treat the node in question as failed, and replace it as described further below.
+  1. Treat the node in question as failed, and replace it as described further below.
- #2 To minimize the amount of forgotten deletes, first increase GCGraceSeconds across the
cluster (rolling restart required), perform a full repair on all nodes, and then change GCGraceSeconds
back again. This has the advantage of ensuring tombstones spread as much as possible, minimizing
the amount of data that may "pop back up" (forgotten delete).
+  2. To minimize the amount of forgotten deletes, first increase GCGraceSeconds across the
cluster (rolling restart required), perform a full repair on all nodes, and then change GCGraceSeconds
back again. This has the advantage of ensuring tombstones spread as much as possible, minimizing
the amount of data that may "pop back up" (forgotten delete).
- #3 Yet another option, that will result in more forgotten deletes than the previous suggestion
but is easier to do, is to ensure 'nodetool repair' has been run on all nodes, and then perform
a compaction to expire tombstones. Following this, read-repair and regular `nodetool repair`
should cause the cluster to converge.
+  3. Yet another option, that will result in more forgotten deletes than the previous suggestion
but is easier to do, is to ensure 'nodetool repair' has been run on all nodes, and then perform
a compaction to expire tombstones. Following this, read-repair and regular `nodetool repair`
should cause the cluster to converge.
  
  === Handling failure ===
  If a node goes down and comes back up, the ordinary repair mechanisms will be adequate to
deal with any inconsistent data.  Remember though that if a node misses updates and is not
repaired for longer than your configured GCGraceSeconds (default: 10 days), it could have
missed remove operations permanently.  Unless your application performs no removes, you should
wipe its data directory, re-bootstrap it, and removetoken its old entry in the ring (see below).
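
The rule in the paragraph above can be written down as a small decision helper. This is only a sketch and not part of Cassandra or the wiki page: the function name `recovery_action`, its parameters, and the returned labels are hypothetical, and it assumes you track when the node was last known to be in sync (last successful repair, or the moment it went down).

```python
from datetime import datetime, timedelta

# Default GCGraceSeconds as stated in the text above: 10 days.
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60


def recovery_action(last_in_sync, now,
                    gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS,
                    app_performs_removes=True):
    """Hypothetical helper: choose how to treat a node that came back up.

    Returns "repair" when the ordinary repair mechanisms are adequate,
    or "replace" when the node should be wiped and re-bootstrapped.
    """
    stale_seconds = (now - last_in_sync).total_seconds()
    if stale_seconds <= gc_grace_seconds:
        # Tombstones for any missed removes are still present on the replicas.
        return "repair"
    if not app_performs_removes:
        # No removes means there are no tombstones that could have been missed.
        return "repair"
    # Missed removes may be permanently lost: wipe, re-bootstrap, removetoken.
    return "replace"


now = datetime(2011, 1, 10)
print(recovery_action(now - timedelta(days=3), now))   # repair
print(recovery_action(now - timedelta(days=12), now))  # replace
```

If GCGraceSeconds has been changed from the default, pass the configured value as `gc_grace_seconds`; the threshold is the only thing that changes.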
