One of our nodes is getting an increasing number of pending compactions due, we think, to an issue which is fixed in the upcoming version 2.0.11. (We had the same error a month ago, but at that time we were in pre-production and could just clean the disks on all the nodes and restart. Now we want to be cleverer.)
To overcome the issue we figure we should just rebuild the node using the same token range, to avoid unneeded data reshuffling. The plan is to (1) find the tokens in use on that node via "nodetool ring", (2) stop Cassandra on that node, (3) delete the data directory, (4) use the tokens saved in step (1) as the initial_token list, and (5) restart the node.
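Steps (1) and (4) could be scripted roughly as follows. This is only a sketch: it assumes each data row of the "nodetool ring" output starts with the node's address and ends with the token, which should be checked against the output of the version actually in use. The function names are made up for illustration.

```python
from typing import List


def tokens_for_node(ring_output: str, address: str) -> List[str]:
    """Collect the tokens listed for `address` in `nodetool ring` text."""
    tokens = []
    for line in ring_output.splitlines():
        fields = line.split()
        # Assumed row layout: the address is the first column and the
        # token is the last column. Header and separator lines never
        # start with the node's address, so they are skipped.
        if len(fields) >= 2 and fields[0] == address:
            tokens.append(fields[-1])
    return tokens


def initial_token_value(ring_output: str, address: str) -> str:
    """Join the node's tokens into a comma-separated initial_token value."""
    return ",".join(tokens_for_node(ring_output, address))
```

The returned string would then be pasted as the initial_token value in cassandra.yaml on the rebuilt node, after saving the ring output to a file before stopping Cassandra.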
But the node is a seed node, and Cassandra won't bootstrap seed nodes. Perhaps removing that node's address from the seeds list on the other nodes (and on that node) will be sufficient; that's what the "Replacing a Dead Seed Node" documentation suggests. So perhaps I can remove the IP address from the seeds list on all nodes in the cluster, restart all the nodes, and then restart the bad node with auto_bootstrap=true.
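Editing the seeds list on every node by hand is error-prone, so here is a minimal sketch of a helper that drops one address from the seeds value. It assumes the seeds setting is the usual comma-separated string of addresses from cassandra.yaml; the function name and the addresses in the usage note are hypothetical.

```python
def remove_seed(seeds: str, address: str) -> str:
    """Return the comma-separated seeds string without `address`."""
    kept = []
    for entry in seeds.split(","):
        entry = entry.strip()
        # Skip empty entries (e.g. a trailing comma) and the address
        # that should no longer act as a seed.
        if entry and entry != address:
            kept.append(entry)
    return ",".join(kept)
```

For example, remove_seed("10.0.0.1, 10.0.0.2,10.0.0.3", "10.0.0.2") gives "10.0.0.1,10.0.0.3", which would then be written back as the seeds value on each node before restarting it.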
I want to use the same IP address, so I don't think I can follow the instructions at
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html, because they assume that the IP addresses
of the dead node and the new node differ.
If I just start it up it will start serving traffic and read requests will fail. It wouldn't be the end of the world (the production use isn't critical yet).
Should we use "nodetool rebuild $LOCAL_DC"? (I think that's mostly for adding a data center, though.) Or should I add the node back in and run "nodetool repair"? I'm afraid that would be too slow.
Again, I don't want to REMOVE the node from the cluster: that would cause reshuffling of token ranges and data. I want to use the same token range.