Decommissioning those nodes isn't a problem.  When you say remove all the data, I assume you mean rm -rf my data directory (the default /var/lib/cassandra/data)?

I'd done this prior to starting up the nodes, because they were installed from the apt-get repo, which automatically starts Cassandra (bad form, that, as a side note).  But the first time I tried to start a node after setting my config, I got an error that system.users didn't exist and it exited out.  The second time I tried to start the nodes, they started.

Outside of clearing the data, leaving initial_token unset, and having num_tokens set, is there anything else I need to do to bring them in and bootstrap them?
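For reference, the vnode-related bits of my cassandra.yaml look something like this (a sketch, not my exact file):

```yaml
# cassandra.yaml fragment (sketch -- the real file has plenty more, of course)
num_tokens: 256
# initial_token:      # left commented out entirely
# auto_bootstrap defaults to true, so I have not set it
```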

Note, these aren't the first nodes I've added to the cluster, but they are giving me fits.  Additionally, this morning, my seed nodes were all flipping out with an error like this:
https://gist.github.com/dmcnelis/5467636  (AssertionError, when trying to determine ranges for nodes)

Once I decommissioned the new nodes, I had no more errors in my seed node logs.


On Fri, Apr 26, 2013 at 5:48 AM, Sam Overton <sam@acunu.com> wrote:
Some extra information you could provide which will help debug this: the logs from those 3 nodes which have no data, and the output of "nodetool ring".

Before seeing those I can only guess, but my guess would be that in the logs on those 3 nodes you will see this: "Calculating new tokens" and this: "Split previous range (blah, blah] into <long list of tokens>"
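A quick way to check for that signature — sketched here against an inline excerpt; on the nodes themselves you'd point grep at the system log (/var/log/cassandra/system.log on the apt install):

```shell
# Stand-in log excerpt for illustration; on a real node, grep
# /var/log/cassandra/system.log (or wherever your install logs) instead.
log_excerpt='INFO Calculating new tokens
INFO Split previous range (blah, blah] into 256 new ranges'
printf '%s\n' "$log_excerpt" | grep -c -E 'Calculating new tokens|Split previous range'
# -> 2
```

A count of zero would mean the nodes did a normal vnode bootstrap and the problem lies elsewhere.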

If that is the case then it means you accidentally started those three nodes with the default configuration (single token), subsequently changed num_tokens, and then joined them into the cluster. What happens when you do this is that the node thinks it used to be responsible for a single range and is being migrated to vnodes, so it splits its single range (now a very small part of the keyspace) into 256 smaller ranges, and ends up with just a tiny portion of the ring assigned to it.
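To put rough numbers on it (the 0.1% figure is invented purely for illustration): splitting one small range into 256 sub-ranges does not change the node's total ownership at all:

```shell
# Toy arithmetic only -- the 0.1% starting ownership is made up.
single_range_pct=0.1
awk -v p="$single_range_pct" \
    'BEGIN { printf "256 vnodes x %.6f%% each = %.1f%% of the ring total\n", p/256, p }'
# -> 256 vnodes x 0.000391% each = 0.1% of the ring total
```

That matches the symptom you're seeing: 256 tokens show up in the ring, but the node's load stays tiny.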

To fix this you'll need to decommission those 3 nodes, remove all data from them, then bootstrap them in again with the correct configuration from the start.
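Sketched as commands — nodetool decommission is the standard way to remove a live node; the paths assume the default apt install, so adjust them to match your cassandra.yaml:

```shell
# On each of the 3 mis-bootstrapped nodes, in order:
nodetool decommission                  # streams its (tiny) data away and leaves the ring
sudo service cassandra stop
sudo rm -rf /var/lib/cassandra/data/* \
            /var/lib/cassandra/commitlog/* \
            /var/lib/cassandra/saved_caches/*   # wipe ALL state, not just data/
# confirm num_tokens is set (e.g. 256) and initial_token is unset in
# cassandra.yaml BEFORE the first start, then:
sudo service cassandra start           # node bootstraps fresh as a vnode node
```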

Sam



On 26 April 2013 06:07, David McNelis <dmcnelis@gmail.com> wrote:
So, I had 7 nodes that I set up using vnodes, 256 tokens each, no problem.

I added two 512 token nodes, no problem, things seemed to balance.

The next 3 nodes I added, all at 256 tokens, have a cumulative load of 116MB (whereas the other nodes are at ~100GB and ~200GB for 256 and 512 tokens, respectively).

Anyone else seen this in 1.2.4?

The nodes seem to join the cluster OK; I have num_tokens set and have tried both an empty initial_token and a commented-out initial_token, with no change.

I see nothing streaming in netstats either, though these nodes were added days apart.  At first I thought I must have a hot key or something, but that doesn't seem to be the case, since the node I thought it was on has evened out over the past couple of days with no new nodes added.

I really *DON'T* want to deal with another shuffle....but what options do I have, since vnodes "make it unneeded to balance the cluster"?  (which, at the moment, seems like a load of bullshit).



--
Sam Overton
Acunu | http://www.acunu.com | @acunu