incubator-cassandra-user mailing list archives

From jivko donev <>
Subject Adding a node to cluster keeping 100% data replicated on all nodes
Date Fri, 07 Feb 2014 12:39:16 GMT

Our environment will consist of clusters of no more than 2 to 4 nodes each (all located in
the same DC). We want to ensure that every node in the cluster owns 100% of the data. The
procedure for adding (or removing) a node will be automated, so we want to make sure we are
taking the right steps. Let's say we have node 'A' up and running and want to add another
node 'B' to form a cluster. Node A's configuration will be:
seed: "IP of A"
listen_address: "IP of A"
num_tokens: 256
The keyspace uses SimpleStrategy with RF: 1.
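
As a rough illustration of why the RF matters for the 100% goal, here is a minimal sketch. It assumes tokens are spread evenly across nodes, so that each node holds on average min(RF, N) / N of the data in an N-node cluster. The function name is hypothetical and not part of any Cassandra tool.

```python
def average_ownership(replication_factor: int, node_count: int) -> float:
    """Approximate fraction of the data each node holds (assumes even token distribution)."""
    if replication_factor < 1 or node_count < 1:
        raise ValueError("replication_factor and node_count must be >= 1")
    # A row cannot have more replicas than there are nodes.
    effective_rf = min(replication_factor, node_count)
    return effective_rf / node_count


if __name__ == "__main__":
    for rf, nodes in [(1, 1), (1, 2), (2, 2), (4, 4)]:
        print(f"RF={rf}, nodes={nodes}: {average_ownership(rf, nodes):.0%} per node")
```

Under that assumption, a two-node cluster with RF 1 leaves each node with roughly half the data, which is why the RF is raised to match the node count in the steps that follow.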

To add node 'B' to the cluster, we do the following:
1. Stop Cassandra on B.
2. Update cassandra.yaml - change seed to point to "IP of A"
3. Update - add node A ip to it and make it the default one.
4. rm -rf /var/lib/cassandra/*
5. Start Cassandra on B.
6. Wait until nodetool status reports that node B is up.
7. Update the RF of the keyspace to 2.
8. Run nodetool repair on B and wait for it to finish.
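
Since the procedure is meant to be automated, steps 6 to 8 could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names and the example keyspace name are hypothetical, the script is assumed to run on node B with nodetool on the PATH, and how the CQL statement is submitted (cqlsh or a driver) is left out.

```python
import subprocess
import time


def alter_rf_cql(keyspace: str, replication_factor: int) -> str:
    """Build the CQL statement for step 7 (SimpleStrategy, as in the post)."""
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {replication_factor}}};"
    )


def node_is_up(status_output: str, address: str) -> bool:
    """Return True if `nodetool status` output lists `address` as UN (Up/Normal)."""
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows look like: "UN  <address>  <load> ...".
        if len(fields) >= 2 and fields[1] == address:
            return fields[0] == "UN"
    return False


def wait_until_up(address: str, poll_seconds: float = 10.0,
                  timeout_seconds: float = 600.0) -> None:
    """Step 6: poll `nodetool status` until the node reports UN or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        result = subprocess.run(["nodetool", "status"], capture_output=True, text=True)
        if result.returncode == 0 and node_is_up(result.stdout, address):
            return
        time.sleep(poll_seconds)
    raise TimeoutError(f"{address} did not reach UN within {timeout_seconds} s")


def repair() -> None:
    """Step 8: run `nodetool repair` on the local node and block until it exits."""
    subprocess.run(["nodetool", "repair"], check=True)


if __name__ == "__main__":
    # "my_keyspace" is a placeholder; nothing here contacts a cluster.
    print(alter_rf_cql("my_keyspace", 2))
```

Keeping the status parsing in its own function makes it possible to check it against captured nodetool output without a running cluster.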

Can we update the RF on A before starting Cassandra on B in order to skip steps 7 and

Now, once the data is in sync on both nodes, we want to make node B a seed node.
9. Update the seed property on A and B to include the IP of node B.
10. Restart Cassandra on both nodes.

When adding more nodes to the cluster, the steps will be the same, except that the seed
property will contain all existing nodes in the cluster.
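
For the automation, the seed property described above could be generated with something like the sketch below. It assumes the seeds are written to cassandra.yaml as a single comma-separated string (the form SimpleSeedProvider expects); the function name and the example addresses are hypothetical.

```python
from typing import Iterable, Optional


def seeds_value(existing_nodes: Iterable[str], promoted_node: Optional[str] = None) -> str:
    """Build the comma-separated seeds string for cassandra.yaml."""
    candidates = list(existing_nodes)
    if promoted_node:
        candidates.append(promoted_node)
    seeds = []
    for address in candidates:
        address = address.strip()
        # Drop blanks and duplicates while keeping the original order.
        if address and address not in seeds:
            seeds.append(address)
    return ",".join(seeds)


if __name__ == "__main__":
    # While B joins, it points only at A; after the repair, both nodes list A and B.
    print(seeds_value(["10.0.0.1"]))
    print(seeds_value(["10.0.0.1"], "10.0.0.2"))
```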

So, are these steps everything we need to do?
Is there anything more we need to do?
Is there an easier way to do what we want, or are all the steps above mandatory?
