incubator-cassandra-user mailing list archives

From David McNelis <dmcne...@gmail.com>
Subject Re: vnodes and load balancing - 1.2.4
Date Fri, 26 Apr 2013 14:34:05 GMT
Decommissioning those nodes isn't a problem.  When you say remove all the
data, I assume you mean rm -rf my data directory (the default
/var/lib/cassandra/data)?

I'd done this prior to starting up the nodes, because they were installed
from the apt-get repo, which automatically starts cassandra (bad form on
that, as a side note).  But the first time I tried to start the node
after setting my config, I got an error that System.Users didn't exist, and
it exited.  The second time I tried to start the nodes, they started.

Outside of clearing the data, leaving initial_token unset, and having
num_tokens set, is there anything else I need to do to bring them in and
bootstrap them?
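
For reference, the checklist above can be sketched as a small pre-start
check. This is only a rough sketch under stated assumptions:
check_bootstrap_config is a hypothetical helper (not part of Cassandra), it
reads only top-level "key: value" lines rather than parsing YAML properly,
and the directories to inspect are whatever the caller passes in (the
commitlog and saved_caches directories would presumably need clearing
along with the data directory).

```python
import os

def check_bootstrap_config(yaml_text, dirs):
    """Hypothetical pre-start check; returns a list of problems found."""
    settings = {}
    for raw in yaml_text.splitlines():
        if raw[:1].isspace():
            continue                          # skip nested (indented) entries
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" in line:
            key, value = line.split(":", 1)
            settings[key.strip()] = value.strip()

    problems = []
    if not settings.get("num_tokens"):
        problems.append("num_tokens is not set")
    if settings.get("initial_token"):
        problems.append("initial_token is set")
    for d in dirs:
        # A directory with leftover files means the node has started before.
        if os.path.isdir(d) and os.listdir(d):
            problems.append(d + " is not empty")
    return problems
```

An empty list means nothing on this checklist is obviously wrong; it says
nothing about other causes.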

Note, these aren't the first nodes that I've added to the cluster, but they
are giving me fits.  Additionally, this morning, my seed nodes were all
flipping out with an error like:
https://gist.github.com/dmcnelis/5467636  (AssertionError, when trying to
determine ranges for nodes)

Once I decommissioned the new nodes, I had no more errors in my seed node
logs.


On Fri, Apr 26, 2013 at 5:48 AM, Sam Overton <sam@acunu.com> wrote:

> Some extra information you could provide which will help debug this: the
> logs from those 3 nodes which have no data and the output of "nodetool ring"
>
> Before seeing those I can only guess, but my guess would be that in the
> logs on those 3 nodes you will see this: "Calculating new tokens" and this:
> "Split previous range (blah, blah] into <long list of tokens>"
>
> If that is the case then it means you accidentally started those three
> nodes with the default configuration (single-token) and then subsequently
> changed (num_tokens) and then joined them into the cluster. What happens
> when you do this is that the node thinks it used to be responsible for a
> single range and is being migrated to vnodes, so it splits its single range
> (now a very small part of the keyspace) into 256 smaller ranges, and ends
> up with just a tiny portion of the ring assigned to it.
>
> To fix this you'll need to decommission those 3 nodes, remove all data
> from them, then bootstrap them in again with the correct configuration from
> the start.
>
>  Sam
>
>
>
> On 26 April 2013 06:07, David McNelis <dmcnelis@gmail.com> wrote:
>
>> So, I had 7 nodes that I set up using vnodes, 256 tokens each, no problem.
>>
>> I added two 512 token nodes, no problem, things seemed to balance.
>>
>> The next 3 nodes I added, all at 256 tokens, have a cumulative load of
>> 116MB (whereas the other nodes are at ~100GB and ~200GB, for 256 and
>> 512 tokens respectively).
>>
>> Anyone else seen this in 1.2.4?
>>
>> The nodes seem to join the cluster ok, and I have num_tokens set and have
>> tried both an empty initial_token and a commented out initial token, with
>> no change.
>>
>> I see nothing streaming with netstats either, though these nodes were
>> added days apart.  At first I thought I must have a hot key or something,
>> but that doesn't seem to be the case, since the node I thought that one was
>> on has evened out over the past couple of days with no new nodes added.
>>
>> I really *DON'T* want to deal with another shuffle....but what options do
>> I have, since vnodes "make it unneeded to balance the cluster"?  (which, at
>> the moment, seems like a load of bullshit).
>>
>
>
>
> --
> Sam Overton
> Acunu | http://www.acunu.com | @acunu
>
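
Sam's explanation above can be made concrete with some rough arithmetic.
This is a simplified model, not how Cassandra actually picks tokens: it
assumes every token in the ring is placed uniformly at random, so a node's
expected share is just its token count over the total. The token counts
(7 nodes at 256, 2 nodes at 512) are from the thread; expected_ownership
is a hypothetical helper.

```python
from fractions import Fraction

def expected_ownership(existing_tokens, new_tokens):
    """Expected ring fraction for a new node, assuming uniformly random tokens."""
    return Fraction(new_tokens, existing_tokens + new_tokens)

# Tokens already in the ring, per the thread: 7 nodes x 256 + 2 nodes x 512.
existing = 7 * 256 + 2 * 512

# Node that first joined with a single token and later split that one range
# into 256 pieces: the pieces still add up to the original single range.
migrated = expected_ownership(existing, 1)

# Node bootstrapped with num_tokens set to 256 from the start.
fresh = expected_ownership(existing, 256)

print(f"migrated: {float(migrated):.4%}  fresh: {float(fresh):.2%}")
```

Under these assumptions the migrated node owns a few hundredths of a
percent of the ring, against roughly a twelfth for a cleanly bootstrapped
node, which is in line with the very small load reported on the three new
nodes.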
