Earlier today I emailed about issues we’re having bootstrapping nodes into our existing cluster. One theory is that our nodes are simply too large, so we are considering moving to more, smaller nodes. However, because we cannot bootstrap, that move is difficult. As I see it, we have two options (assuming the new cluster is already set up and running):
- Add the new cluster as another data center. I am already using NetworkTopologySnitch. The existing nodes would then stream their data over to the new cluster. A couple of questions here:
  - I assume it's OK if data centers have different node sizes (i.e. smaller) and different node counts (more nodes)?
  - Is adding a new data center to a cluster basically one large bootstrap, in which case it's quite possible our existing bootstrap issues would show up again? The documentation for nodetool rebuild suggests it is.
- Use SSTableLoader to bulk load data from the existing cluster into the new one. To do so, I would need to take the following steps:
  - Have clients start dual writes to both the new and old clusters (reading only from the old one)
  - Back up the data on the nodes. We are using JNA, so this should not double the disk space used, correct? I assume I can then simply FTP the hard-linked files to another server?
  - Run SSTableLoader against the new cluster for each of the SSTables taken from the backup
  - When SSTableLoader has completed, the new cluster will have all of the data and the old cluster can be decommissioned
Thoughts? Are there any automated tools around the SSTableLoader option?
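
To make the SSTableLoader option more concrete, here is a rough sketch of how I imagine scripting the backup and load steps. It is untested against a real cluster and rests on several assumptions: that snapshots live under <data_dir>/<keyspace>/<table>/snapshots/<tag>/, that sstableloader wants a directory whose last two path components are the keyspace and table names, and that `-d` takes a comma-separated list of hosts in the new cluster. The function names, the staging directory and the snapshot tag are all made up for illustration.

```python
import shutil
from pathlib import Path


def plan_bulk_load(data_dir, snapshot_tag, staging_dir, new_cluster_hosts):
    """Return one (snapshot_dir, staged_dir, command) triple per table.

    Assumption: snapshots are laid out as
    <data_dir>/<keyspace>/<table>/snapshots/<snapshot_tag>/.
    Nothing is copied or executed here; this only builds the plan.
    """
    plan = []
    pattern = f"*/*/snapshots/{snapshot_tag}"
    for snap in sorted(Path(data_dir).glob(pattern)):
        if not snap.is_dir():
            continue
        table_dir = snap.parent.parent
        keyspace = table_dir.parent.name
        # Some versions name the directory <table>-<id>; keep only the table name.
        table = table_dir.name.split("-")[0]
        # Hypothetical staging layout: <staging_dir>/<keyspace>/<table>/
        staged = Path(staging_dir) / keyspace / table
        cmd = ["sstableloader", "-d", ",".join(new_cluster_hosts), str(staged)]
        plan.append((snap, staged, cmd))
    return plan


def stage_snapshot(snapshot_dir, staged_dir):
    """Copy the snapshot's files into the keyspace/table directory layout."""
    staged_dir.mkdir(parents=True, exist_ok=True)
    for f in snapshot_dir.iterdir():
        if f.is_file():
            shutil.copy2(f, staged_dir / f.name)
```

The idea would be to call plan_bulk_load on each old node after taking a snapshot, call stage_snapshot for every entry, and then run each command (for example with subprocess.run) from a machine that can reach the new cluster. The copy step does use extra disk space on the staging host, unlike the hard-linked snapshot itself.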