cassandra-commits mailing list archives

From Apache Wiki <>
Subject [Cassandra Wiki] Update of "Operations" by JonHaddad
Date Mon, 29 Sep 2014 02:16:05 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by JonHaddad:

  When the !RandomPartitioner is used, Tokens are integers from 0 to 2**127.  Keys are converted
to this range by MD5 hashing for comparison with Tokens.  (Thus, keys are always convertible
to Tokens, but the reverse is not always true.)
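The key-to-Token conversion described above can be sketched in Python. This is only an illustration, not Cassandra's code: the function name `key_to_token` is hypothetical, and the sketch assumes the MD5 digest is read as a signed big-endian integer whose absolute value is taken, which yields a value in the 0 to 2**127 range.

```python
import hashlib


def key_to_token(key: bytes) -> int:
    """Map a row key to an integer in [0, 2**127] via MD5 (illustrative only)."""
    digest = hashlib.md5(key).digest()  # 16 bytes = 128 bits
    # Assumption: interpret the digest as a signed 128-bit integer, then take abs().
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


if __name__ == "__main__":
    for k in (b"alice", b"bob", b"carol"):
        print(k, key_to_token(k))
```

Because the hash is one-way, a Token produced this way cannot be mapped back to the key that produced it.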
  === Token selection ===
  Using a strong hash function means !RandomPartitioner keys will, on average, be evenly spread
across the Token space, but you can still have imbalances if your Tokens do not divide up
the range evenly. You should therefore set !InitialToken on your first nodes to `i * (2**127
/ N)` for i = 0 .. N-1. In Cassandra 0.7, you should specify `initial_token` in `cassandra.yaml`.
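As a minimal sketch of the formula `i * (2**127 / N)`, the following Python computes evenly spaced initial tokens for an N-node cluster. The function name `initial_tokens` is hypothetical, and integer division is an assumption made here so that the results are whole numbers.

```python
def initial_tokens(node_count: int) -> list[int]:
    """Return evenly spaced tokens i * (2**127 // N) for i = 0 .. N-1."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = 2**127 // node_count  # width of each node's share of the Token space
    return [i * step for i in range(node_count)]


if __name__ == "__main__":
    # Print one token per node for a 4-node cluster.
    for node, token in enumerate(initial_tokens(4)):
        print(f"node {node}: initial_token = {token}")
```

Each printed value would be assigned to one node before it first joins the cluster.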
  With !NetworkTopologyStrategy, you should calculate the tokens for the nodes in each DC independently.
Tokens still need to be unique, so you can add 1 to the tokens in the 2nd DC, add 2 in the
3rd, and so on.  Thus, for a 4-node cluster in 2 datacenters, you would have
@@ -45, +46 @@

  With order preserving partitioners, your key distribution will be application-dependent.
 You should still take your best guess at specifying initial tokens (guided by sampling actual
data, if possible), but you will be more dependent on active load balancing (see below) and/or
adding new nodes to hot spots.
  Once data is placed on the cluster, the partitioner may not be changed without wiping and
starting over.
+ As a caveat to the above section, it is generally not necessary to manually select individual
tokens when using the vnodes feature.
  === Replication ===
  A Cassandra cluster always divides up the key space into ranges delimited by Tokens as described
above, but additional replica placement is customizable via IReplicaPlacementStrategy in the
configuration file.  The standard strategies are
