incubator-cassandra-user mailing list archives

From Greg Kim <g...@netflix.com>
Subject Best practice for adding new nodes to ring
Date Tue, 26 Oct 2010 17:21:01 GMT
Hi,

I have a question regarding the best practices for adding new nodes to an existing cluster.
From reading the wiki at http://wiki.apache.org/cassandra/Operations, I understand that
when creating a brand new cluster we can use the following to calculate the initial
token for each node to achieve balance in the ring:
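  def tokens(nodes):
      # Print one evenly spaced initial token per node.
      for i in range(1, nodes + 1):
          # Integer division keeps each token an exact integer.
          print(i * (2 ** 127 - 1) // nodes)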


My question is on the best practice for adding new nodes to an existing cluster.  The wiki
recommends computing new tokens for every node and assigning them manually using the
nodetool command.  We're planning on running either 16GB or 32GB heaps on each of our
nodes, so token re-assignment for each node in the cluster sounds like a very expensive
operation, especially in situations where we're adding new nodes to handle scaling issues
w/ the existing cluster.
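
To make the concern concrete, here is a rough sketch that counts how many existing
tokens stay put when the ring is rebalanced for a larger node count.  It reuses the
token formula above; the names balanced_tokens and tokens_kept are made up for this
illustration and are not part of cassandra or nodetool.

```python
RING_MAX = 2 ** 127 - 1  # same constant as in the token formula above


def balanced_tokens(nodes):
    """Evenly spaced tokens for a ring with the given number of nodes."""
    return [i * RING_MAX // nodes for i in range(1, nodes + 1)]


def tokens_kept(old_nodes, new_nodes):
    """Count tokens of the old balanced ring that are also valid
    positions in the new balanced ring (hypothetical helper)."""
    old = set(balanced_tokens(old_nodes))
    new = set(balanced_tokens(new_nodes))
    return len(old & new)


if __name__ == "__main__":
    # Growing from 4 to 5 nodes: count how many nodes keep their token.
    print(tokens_kept(4, 5))
```

Under this formula, going from 4 to 5 nodes leaves only the node at the top of the
ring on its old token, so every other node would have to be moved.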

I'm a bit of a noob to cassandra, so I wanted to see how others are currently coping w/
this.  One option is to grow the cluster in powers of 2 and use bootstrapping w/ automatic
token generation.  Is this an option that people are using?  (It gets exponentially
expensive, though, when you already have a large # of nodes.)
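
As a sketch of the doubling idea: if each new node takes the midpoint of an existing
range, the old nodes can keep their tokens and the ring stays balanced.  The midpoint
rule here is a simplifying assumption for illustration, not a description of how
automatic token generation actually picks tokens, and doubling_tokens is a
hypothetical name.

```python
RING_MAX = 2 ** 127 - 1  # same constant as in the token formula above


def balanced_tokens(nodes):
    """Evenly spaced tokens for a ring with the given number of nodes."""
    return [i * RING_MAX // nodes for i in range(1, nodes + 1)]


def doubling_tokens(existing):
    """Return one new token per existing token, each bisecting the range
    that ends at that token (assumes the ring starts at 0)."""
    new_tokens = []
    prev = 0
    for token in sorted(existing):
        new_tokens.append(prev + (token - prev) // 2)
        prev = token
    return new_tokens


if __name__ == "__main__":
    old = balanced_tokens(4)
    added = doubling_tokens(old)
    # The combined ring matches a freshly balanced 8-node ring.
    print(sorted(old + added) == balanced_tokens(8))
```

The catch is the one noted above: every doubling needs as many new nodes as the
cluster already has.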

Does anyone know why cassandra doesn't use virtual tokens (e.g. one node token creating
256 virtual node tokens in the ring)?  That way, adding new nodes to an existing cluster
would cause far less imbalance in the ring.
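
A small simulation of the virtual token idea, for illustration only: each node gets a
number of randomly placed tokens, and we measure what fraction of the ring each node
ends up owning.  Random placement, the ring size constant, and the ownership function
are all assumptions of this sketch and say nothing about how cassandra itself behaves.

```python
import random

RING_SIZE = 2 ** 127  # assumed size of the token space for this sketch


def ownership(tokens_per_node, nodes, seed=0):
    """Fraction of the ring owned by each node when every node holds
    tokens_per_node randomly placed tokens (hypothetical model)."""
    rng = random.Random(seed)
    ring = []
    for node in range(nodes):
        for _ in range(tokens_per_node):
            ring.append((rng.randrange(RING_SIZE), node))
    ring.sort()

    owned = [0] * nodes
    # Start from the last token, shifted back one full ring, so the
    # first token is credited with the wrap-around range.
    prev = ring[-1][0] - RING_SIZE
    for token, node in ring:
        owned[node] += token - prev
        prev = token
    return [size / RING_SIZE for size in owned]


if __name__ == "__main__":
    # Compare the largest share with 1 token per node vs 256 per node.
    print(max(ownership(1, 6)))
    print(max(ownership(256, 6)))
```

With many tokens per node, each node's share is the sum of many small ranges, so the
shares tend to even out without anyone computing tokens by hand.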


Thanks
gkim