I had a three-node cluster with RF=3. Then my demand dropped off quite a bit, and I wanted to bring the cluster down to just one node for a while to lower my server costs while I worked on other things.
Dropping the first node off the cluster worked fine using nodetool decommission. On the second node, I forgot to decommission it before terminating the server instance. For some reason, that caused the remaining node to stop working. So now I have one broken node and a backup of the data from the second node.
I'd like to just bring up the one node and get it working again. It should have a copy of all the data since I never ran the cluster with more nodes than the RF.
Here's some more info on where I'm at that might help.
All the servers were running 0.6.5.
This is the output I get from nodetool ring:
Address Status Load Range Ring
10.202.65.143 Up 27.13 GB 165675654950889355108929973590945588660 |<--|
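For reference, pulling the token out of that ring output can be scripted; a rough sketch, assuming this 0.6-style column layout (ring.txt here is just the saved output, with the token as the next-to-last field and the ring art last):

```shell
# Save the ring output (inlined here for illustration).
cat > ring.txt <<'EOF'
Address Status Load Range Ring
10.202.65.143 Up 27.13 GB 165675654950889355108929973590945588660 |<--|
EOF

# Skip the header; print each node's address and token.
awk 'NR > 1 { print $1, $(NF-1) }' ring.txt
```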
I dumped the LocationInfo column family from the system keyspace and ran nodetool removetoken on anything that looked remotely like a token. Every time, nodetool produced no output, except when I tried to remove the token shown in the ring output; then, of course, it told me I couldn't remove the local node's token.
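Concretely, each attempt looked roughly like this (the token is a placeholder, not one of my real candidate values, and -p 8080 is the 0.6-era default JMX port; your flags may differ):

```shell
# One attempt per candidate value pulled out of LocationInfo.
# Placeholder token shown; this needs a running node to do anything.
nodetool -h localhost -p 8080 removetoken 1234567890
```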
I tried rebuilding the node from scratch yesterday but got the same results. The token shown in the ring was different, but otherwise all the output is the same.
The more extreme option I considered today is creating a whole new node on a new server, dumping all the db files out to JSON, and then importing them into the new node. Not sure that'll be any different from what I've tried, but it feels like it would be as clean as I could get.
Thanks for the followups,
On Oct 7, 2010, at 7:00 PM, Matthew Dennis wrote:
I'm confused about why removetoken doesn't do anything and would be interested in finding out why, but to answer your question:
You can shut down your last node, nuke the system directory (make a backup just in case), restart the node, load the schema (export it first if need be), and be on your way. You should end up with a node that is the only one in the ring. Again, make a backup of the system directory (actually, might as well just back up the entire data and commitlog directories) before you start nuking stuff.
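A minimal sketch of that sequence, assuming the 0.6 default data paths from storage-conf.xml (adjust to your own layout, and stop Cassandra first):

```shell
# Back everything up before deleting anything.
cp -a /var/lib/cassandra/data /var/lib/cassandra/data.bak
cp -a /var/lib/cassandra/commitlog /var/lib/cassandra/commitlog.bak

# Remove only the system keyspace's data; application keyspaces stay.
rm -rf /var/lib/cassandra/data/system

# Restart Cassandra. On 0.6 the schema lives in storage-conf.xml, so
# check that your keyspace definitions are still in place; the node
# should come back up as the only member of the ring.
```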
On Thu, Oct 7, 2010 at 7:12 PM, Aaron Morton <email@example.com> wrote:
I'm a bit confused about what you are trying to do here. You have 2 nodes with RF = ?, you lost one node completely, and now you want to...
1. Just get a cluster running again, don't worry about the data.
2. Restore the data from the dead node.
3. Create a cluster with the data from the remaining node and a new node.
I was able to figure out how to use the sstable2json tool to get the values out of the system keyspace.
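For anyone who hits this later, the invocation was roughly as follows (the -Data.db filename is an example; the exact name will differ on your node):

```shell
# Dump a LocationInfo sstable from the system keyspace to JSON.
sstable2json /var/lib/cassandra/data/system/LocationInfo-1-Data.db > locationinfo.json
```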
Unfortunately, the node that went down took all of its data with it, and I only have access to the system keyspace of the remaining live node. There were only two nodes, and the one left should have a whole copy of the DB.
Running removetoken on any of the values that appeared to be tokens in the LocationInfo cf hasn't done any good. Perhaps I'm missing which value is the token of the dead node? Or is there a way to take down the last node and bring up a new cluster using the sstables I have on the remaining node?
On Oct 7, 2010, at 3:22 PM, Allan Carroll wrote:
> Hey all,
> I had a node go down that I'm not able to get a token for from nodetool ring.
> The wiki says:
> "You can obtain the dead node's token by running nodetool ring on any live node, unless there was some kind of outage, and the others came up but not the down one -- in that case, you can retrieve the token from the live nodes' system tables."
> But, I can't for the life of me figure out how to get the system keyspace to give up the secret. All attempts end up in:
> ERROR [pool-1-thread-2] 2010-10-07 21:20:44,865 Cassandra.java (line 1280) Internal error processing get_slice
> java.lang.RuntimeException: No replica strategy configured for system
> Can someone point me at a good way to get the token?