cassandra-user mailing list archives

From Donald Smith <Donald.Sm...@audiencescience.com>
Subject Rebuilding a cassandra seed node with the same tokens and same IP address
Date Sat, 30 Aug 2014 02:09:40 GMT
One of our nodes is accumulating an increasing number of pending compactions due, we think, to
https://issues.apache.org/jira/browse/CASSANDRA-7145, which is fixed in the upcoming 2.0.11
release. (We had the same error a month ago, but at that time we were in pre-production and
could just clean the disks on all the nodes and restart. Now we want to be cleverer.)


To overcome the issue we figure we should just rebuild the node using the same token range,
to avoid unneeded data reshuffling. The plan would be to (1) find the tokens in use on
that node via "nodetool ring", (2) stop cassandra on that node, (3) delete the data directory,
(4) use the tokens saved in step (1) as the initial_token list, and (5) restart the node.
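
Steps (1) and (4) could be scripted roughly as follows. This is only a sketch: it assumes
each data row of "nodetool ring" output starts with the node address and ends with the token,
and that initial_token accepts a comma-separated list. The function names tokens_for_node and
initial_token_line are made up for illustration, as is the sample address 10.0.0.1.

```python
def tokens_for_node(ring_output, address):
    """Collect the tokens listed for `address` in `nodetool ring` text.

    Assumption: each data row begins with the node address and ends
    with the token; header and blank lines are skipped.
    """
    tokens = []
    for line in ring_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == address:
            tokens.append(fields[-1])
    return tokens


def initial_token_line(tokens):
    """Format tokens as a comma-separated initial_token setting."""
    return "initial_token: " + ",".join(tokens)


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): nodetool ring | python tokens.py 10.0.0.1
    print(initial_token_line(tokens_for_node(sys.stdin.read(), sys.argv[1])))
```

The printed line would then be pasted into cassandra.yaml on the rebuilt node before step (5).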


But the node is a seed node and cassandra won't bootstrap seed nodes. Perhaps removing that
node's address from the seeds list on the other nodes (and on that node) will be sufficient.
That's what Replacing a Dead Seed Node<http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_seed_node.html>
suggests. Perhaps I can remove the ip address from the seeds list on all nodes in the cluster,
restart all the nodes, and then restart the bad node with auto_bootstrap=true.
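
Editing the seeds list on every node is mechanical enough to script. A minimal sketch, assuming
the seeds value is the usual comma-separated string from cassandra.yaml; remove_seed is a
hypothetical helper, not part of any Cassandra tooling, and the addresses are placeholders.

```python
def remove_seed(seeds, address):
    """Return the comma-separated seeds string without `address`.

    Whitespace around entries is dropped; other entries keep their order.
    """
    kept = []
    for entry in seeds.split(","):
        entry = entry.strip()
        if entry and entry != address:
            kept.append(entry)
    return ",".join(kept)


if __name__ == "__main__":
    # Placeholder addresses; 10.0.0.2 stands in for the bad node.
    print(remove_seed("10.0.0.1, 10.0.0.2,10.0.0.3", "10.0.0.2"))
```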


I want to use the same IP address, so I don't think I can follow the instructions at
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html
because they assume the IP addresses of the dead node and the new node differ.


If I just start it up, it will start serving traffic and read requests will fail. It wouldn't
be the end of the world (the production use isn't critical yet).


Should we use "nodetool rebuild $LOCAL_DC" (though I think that's mostly for adding a data
center)? Or should I add it back in and do "nodetool repair"? I'm afraid that would be too slow.


Again, I don't want to REMOVE the node from the cluster: that would cause reshuffling of token
ranges and data. I want to use the same token range.


Any suggestions?


Thanks, Don
