cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "Operations" by JonathanEllis
Date Wed, 09 Dec 2009 03:41:49 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=7&rev2=8

--------------------------------------------------

  
  Note that with !RackAwareStrategy, succeeding nodes along the ring should alternate data
centers to avoid hot spots.  For instance, if you have nodes A, B, C, and D in increasing
Token order, and instead of alternating you place A and B in DC1, and C and D in DC2, then
nodes C and A will have disproportionately more data on them because they will be the replica
destination for every Token range in the other data center.
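
The hot spot described above can be reproduced with a small simulation. This is only a sketch under a simplified placement model that is an assumption, not Cassandra's actual code: replication factor 2, with the second replica going to the first node clockwise on the ring that sits in a different data center. The function name `replica_load` and the node-to-DC maps are made up for illustration.

```python
from collections import Counter

def replica_load(ring, dc_of):
    """Count how many token ranges each node stores (RF=2).

    Simplified model (an assumption, not Cassandra's code): the primary
    replica is the range's owner, and the second replica is the first node
    clockwise along the ring that sits in a different data center.
    """
    load = Counter()
    n = len(ring)
    for i, primary in enumerate(ring):
        load[primary] += 1
        for step in range(1, n):
            candidate = ring[(i + step) % n]
            if dc_of[candidate] != dc_of[primary]:
                load[candidate] += 1
                break
    return dict(load)

ring = ["A", "B", "C", "D"]  # nodes in increasing Token order
grouped = replica_load(ring, {"A": "DC1", "B": "DC1", "C": "DC2", "D": "DC2"})
alternating = replica_load(ring, {"A": "DC1", "B": "DC2", "C": "DC1", "D": "DC2"})
print("grouped:    ", grouped)
print("alternating:", alternating)
```

Under this model the grouped layout leaves A and C holding three ranges each while B and D hold one, whereas the alternating layout gives every node two.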
  
- Replication strategy may not be changed without wiping your data and starting over.
+ Replication strategy is not intended to be changed after loading data, but it can be done
if you need to badly enough. The procedure would look something like:
+  1. have each node do an anticompaction for its primary range
+  1. manually scp the resulting sstables to the new replica points
+  1. then switch the replication strategy
+ 
+ This could be done offline, or online at the cost of introducing some temporary inconsistency
that could be fixed by repair (see below).
  
  = Adding new nodes =
  Adding new nodes is called "bootstrapping."
@@ -65, +70 @@

   1. Remove the old node from the ring first, or bring up a replacement node with the same
IP and Token as the old; otherwise, the old node will stay part of the ring in a "down" state,
which will degrade your replication factor for the affected Range.
    * If you don't know the Token of the old node, you can retrieve it from any of the other
nodes' `system` keyspace, !ColumnFamily `LocationInfo`, key `L`.
    * You can also run `nodeprobe ring` to look up a node's token (unless there was some kind
of outage, and the others came up but not the down one).
-  1. Removing the old node, then bootstrapping the new one, may be more performant than using
Anti-Entropy.  Testing needed.
+  1. Removing the old node, then bootstrapping the new one, may be more performant than using
Anti-Entropy (testing needed), and will eliminate incorrect answers given by the replacement
node while it does not yet have all the data for its Range.
-   * Even brute-force rsyncing of data from the relevant replicas and running cleanup on
the replacement node may be more performant
+   * To test: even brute-force rsyncing of data from the relevant replicas and running cleanup
on the replacement node may be more performant.
  
  = Backing up data =
  Cassandra can snapshot data while online using `nodeprobe snapshot`.  You can then back
up those snapshots using any desired system, although leaving them where they are is probably
the option that makes the most sense on large clusters.
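
If you do want to copy a snapshot elsewhere, one possible approach is to pack it into a tarball first. The sketch below is generic and does not call Cassandra at all: `archive_snapshot` is a hypothetical helper, and the snapshot directory must be supplied by you, since its location depends on your configuration.

```python
import tarfile
from pathlib import Path

def archive_snapshot(snapshot_dir, dest_tar):
    """Pack every file under snapshot_dir into a gzip-compressed tarball."""
    snapshot_dir = Path(snapshot_dir)
    with tarfile.open(dest_tar, "w:gz") as tar:
        for path in sorted(snapshot_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the snapshot root so the archive
                # can be unpacked anywhere.
                tar.add(path, arcname=str(path.relative_to(snapshot_dir)))
    return dest_tar
```

Write the tarball outside the snapshot directory so that it is not picked up while the directory is being walked.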
