incubator-cassandra-user mailing list archives

From John Watson <j...@disqus.com>
Subject Re: Adding nodes in 1.2 with vnodes requires huge disks
Date Mon, 29 Apr 2013 23:27:06 GMT
Opened a ticket:

https://issues.apache.org/jira/browse/CASSANDRA-5525


On Mon, Apr 29, 2013 at 2:24 AM, aaron morton <aaron@thelastpickle.com> wrote:

> is this understanding correct "we had a 12 node cluster with 256 vnodes on
> each node (upgraded from 1.1), we added two additional nodes that streamed
> so much data (600+GB when other nodes had 150-200GB) during the joining
> phase that they filled their local disks and had to be killed" ?
>
> Can you raise a ticket on https://issues.apache.org/jira/browse/CASSANDRA and
> update the thread with the ticket number?
>
> Can you show the output from nodetool status so we can get a feel for the
> ring?
> Can you include the logs from one of the nodes that failed to join ?
>
> Thanks
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 29/04/2013, at 10:01 AM, John Watson <john@disqus.com> wrote:
>
> On Sun, Apr 28, 2013 at 2:19 PM, aaron morton <aaron@thelastpickle.com> wrote:
>
>>> We're going to try running a shuffle before adding a new node again...
>>> maybe that will help
>>>
>> I don't think it will hurt, but I doubt it will help.
>>
>
> We had to bail on shuffle since we need to add capacity ASAP, not in 20
> days.
>
>
>>
>>    It seems when new nodes join, they are streamed *all* sstables in the
>>>> cluster.
>>>>
>> How many nodes did you join, what was the num_tokens?
>> Did you notice streaming from all nodes (in the logs) or are you saying
>> this in response to the cluster load increasing ?
>>
>>
> We were only adding 2 nodes at the time (planning to add a total of 12).
> We started with a cluster of 12, but are down to 11 since one node entered
> some weird state when one of the new nodes ran out of disk space.
> num_tokens is set to 256 on all nodes.
> Yes, nearly all current nodes were streaming to the new ones (which was
> great until disk space became an issue).
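[Editorial aside: a toy sketch of why "nearly all current nodes were streaming" is exactly what vnodes predict. This is not Cassandra's actual replica-placement code; it models primary ranges only (ignoring replication factor) with random Murmur3-style tokens. With 256 random tokens landing among the 3072 tokens of a 12-node cluster, the chance that any existing node is *not* a streaming source is roughly 12·(11/12)^256, i.e. effectively zero.]

```python
import bisect
import random

random.seed(42)

TOKEN_SPACE = 2**64   # model the token ring as points on [0, 2^64)
NUM_NODES = 12        # existing cluster size, as in the thread
VNODES = 256          # num_tokens on every node, as in the thread

# Existing cluster: each node owns 256 randomly placed tokens.
owner = {}  # token -> node id
for node in range(NUM_NODES):
    for _ in range(VNODES):
        owner[random.randrange(TOKEN_SPACE)] = node

ring = sorted(owner)

# A joining node picks 256 random tokens. For each new token t, the range
# it claims was previously owned (as a primary range) by whichever node
# owns the next token clockwise, so that node must stream data to it.
sources = set()
for _ in range(VNODES):
    t = random.randrange(TOKEN_SPACE)
    i = bisect.bisect_right(ring, t) % len(ring)  # next token clockwise
    sources.add(owner[ring[i]])

print(f"distinct streaming sources: {len(sources)} of {NUM_NODES}")
```

With a real replication factor of 2 or 3, each claimed range has multiple candidate sources, so the pull from "everywhere at once" is even more pronounced; what should not happen is each source sending *all* of its sstables, which is what the ticket above is about.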
>
>>     The purple line machine, I just stopped the joining process because
>>>> the main cluster was dropping mutation messages at this point on a few
>>>> nodes (and it still had dozens of sstables to stream.)
>>>>
>> Which were the new nodes?
>> Can you show the output from nodetool status?
>>
>>
> The new nodes are the purple and gray lines above all the others.
>
> nodetool status doesn't show joining nodes. I think I saw a bug already
> filed for this but I can't seem to find it.
>
>
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 27/04/2013, at 9:35 AM, Bryan Talbot <btalbot@aeriagames.com> wrote:
>>
>> I believe that "nodetool rebuild" is used to add a new datacenter, not
>> just a new host to an existing cluster.  Is that what you ran to add the
>> node?
>>
>> -Bryan
>>
>>
>>
>> On Fri, Apr 26, 2013 at 1:27 PM, John Watson <john@disqus.com> wrote:
>>
>>> Small relief we're not the only ones that had this issue.
>>>
>>> We're going to try running a shuffle before adding a new node again...
>>> maybe that will help
>>>
>>> - John
>>>
>>>
>>> On Fri, Apr 26, 2013 at 5:07 AM, Francisco Nogueira Calmon Sobral <
>>> fsobral@igcorp.com.br> wrote:
>>>
>>>> I am using the same version and observed something similar.
>>>>
>>>> I've added a new node, but the instructions from Datastax did not work
>>>> for me. Then I ran "nodetool rebuild" on the new node. After this
>>>> command finished, the new node held twice the load of the other nodes.
>>>> Even after I ran "nodetool cleanup" on the older nodes, the situation
>>>> was the same.
>>>>
>>>> The problem only seemed to disappear when "nodetool repair" was applied
>>>> to all nodes.
>>>>
>>>> Regards,
>>>> Francisco Sobral.
>>>>
>>>>
>>>>
>>>>
>>>> On Apr 25, 2013, at 4:57 PM, John Watson <john@disqus.com> wrote:
>>>>
>>>> After finally upgrading to 1.2.3 from 1.1.9, enabling vnodes, and
>>>> running upgradesstables, I figured it would be safe to start adding nodes
>>>> to the cluster. Guess not?
>>>>
>>>> It seems when new nodes join, they are streamed *all* sstables in the
>>>> cluster.
>>>>
>>>>
>>>> https://dl.dropbox.com/s/bampemkvlfck2dt/Screen%20Shot%202013-04-25%20at%2012.35.24%20PM.png
>>>>
>>>> The gray line machine ran out of disk space and, for some reason,
>>>> cascaded into errors across the cluster about 'no host id' when trying
>>>> to store hints for it (even though it hadn't joined yet).
>>>> On the purple line machine, I just stopped the joining process because
>>>> the main cluster was dropping mutation messages on a few nodes at that
>>>> point (and it still had dozens of sstables left to stream).
>>>>
>>>> I followed this:
>>>> http://www.datastax.com/docs/1.2/operations/add_replace_nodes
>>>>
>>>> Is there something missing in that documentation?
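[Editorial aside: for reference, a sketch of the cassandra.yaml settings that the add-node procedure linked above assumes on a joining 1.2 node. The IP addresses are placeholders, not values from this thread.]

```yaml
# cassandra.yaml on the joining node (Cassandra 1.2.x)
num_tokens: 256        # match the vnode count on the existing nodes
initial_token:         # leave blank when using vnodes
auto_bootstrap: true   # stream data from existing replicas while joining
seed_provider:
    # seeds must list existing nodes only, never the joining node itself
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1,10.0.0.2"
```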
>>>>
>>>> Thanks,
>>>>
>>>> John
>>>>
>>>>
>>>>
>>>
>>
>>
>
>
