cassandra-user mailing list archives

From Philipp Potisk <philipp.pot...@omnecon.com>
Subject Re: StreamException while adding nodes
Date Fri, 13 Jun 2014 06:51:25 GMT
As we are still unable to add the 3 additional nodes, we would appreciate
any further thoughts.

I have removed all 3 half-joined nodes, deleted their data directories and
started only one node. Since then (more than 24 hours ago) the node has been
in status JOINING (nodetool status: UJ, nodetool gossipinfo:
STATUS:BOOT,-7774403902045887560) but is not receiving any data.
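
For anyone who wants to watch this state from a script, here is a minimal
Python sketch that splits a gossip STATUS line like the one above into its
state and the remaining fields. It assumes the line looks exactly as quoted
in this message (other Cassandra versions may print extra fields), and the
function name parse_gossip_status is made up for illustration.

```python
def parse_gossip_status(line):
    """Split a gossipinfo line such as 'STATUS:BOOT,<token>' into
    (state, fields). Returns None for any line that is not a STATUS line.

    Assumption: the 'STATUS:<state>,<field>,...' layout quoted in this
    message; this is not a documented, stable format.
    """
    key, _, value = line.strip().partition(":")
    if key != "STATUS":
        return None
    state, *fields = value.split(",")
    return state, fields


if __name__ == "__main__":
    print(parse_gossip_status("  STATUS:BOOT,-7774403902045887560"))
```

A node that has finished joining would be expected to report a state other
than BOOT here.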

nodetool status shows that only 5.72 MB has arrived so far:

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Owns (effective)  Host ID                               Token                 Rack
UJ  10.140.118.4    5.72 MB    ?                 dc110f47-67b0-40c9-bef7-3dff59bfe29c  -9201583989361968764  rack1
UN  10.53.186.53    29.59 GB   43.1%             80cb0036-33b9-4c37-b789-7dac340034ee  -9137279293977023905  rack1
UN  10.140.120.27   25.27 GB   37.8%             2564094b-08ea-42c4-82b0-a8246bd3ebcf  -9201237785760477995  rack1
UN  10.53.170.3     26.82 GB   38.1%             737f49e5-684f-46ef-bf8b-c82326128835  -9106630210265624873  rack1
UN  10.140.104.105  27.88 GB   39.7%             18c74472-235d-4284-9906-0ab8cc40011d  -9213643688261125087  rack1
UN  10.53.170.41    26.28 GB   41.3%             866d2276-0dac-41b3-aece-6a2711ef0234  -9031518559431277310  rack1
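
To track whether the load of the joining node changes over time, output like
the above can be grouped by state with a few lines of Python. This is only a
sketch: it assumes each node is printed on a single line starting with the
two-letter status/state code, and the name nodes_by_state is hypothetical.

```python
import re

# One node row: two-letter code (U/D + N/L/J/M), address, then the load,
# which is either "?" or a number followed by a unit (e.g. "5.72 MB").
ROW = re.compile(r"^([UD][NLJM])\s+(\S+)\s+(\?|\S+ \S+)(?:\s|$)")


def nodes_by_state(status_text):
    """Map each state code (e.g. 'UJ', 'UN') to a list of (address, load)."""
    result = {}
    for line in status_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            state, address, load = match.groups()
            result.setdefault(state, []).append((address, load))
    return result


if __name__ == "__main__":
    sample = """\
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Owns (effective)  Host ID  Token  Rack
UJ  10.140.118.4    5.72 MB    ?                 dc110f47  -9201583989361968764  rack1
UN  10.53.186.53    29.59 GB   43.1%             80cb0036  -9137279293977023905  rack1
"""
    print(nodes_by_state(sample))
```

Feeding it the captured text of nodetool status at intervals and comparing
the load reported under "UJ" would show whether any data is still arriving.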

Furthermore, it is very strange that nodetool describering does not include
the IP of the new node in the endpoints list. The command:

nodetool describering TransactionUseCaseAddNodes | grep 10.140.118.4

does not output anything.
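
A plain grep for 10.140.118.4 would also match an address such as
10.140.118.40, so a stricter check can be sketched in Python. It assumes the
describering output prints each range as TokenRange(..., endpoints:[ip, ip],
rpc_endpoints:[...], ...); that layout and the name ranges_owned_by are
assumptions for illustration, not something confirmed in this thread.

```python
import re

# Capture the contents of "endpoints:[...]" while skipping
# "rpc_endpoints:[...]" (the lookbehind rejects a preceding letter or "_").
ENDPOINTS = re.compile(r"(?<![A-Za-z_])endpoints:\[([^\]]*)\]")


def ranges_owned_by(describering_text, ip):
    """Count the token ranges whose endpoints list contains exactly `ip`."""
    count = 0
    for match in ENDPOINTS.finditer(describering_text):
        endpoints = [e.strip() for e in match.group(1).split(",")]
        if ip in endpoints:
            count += 1
    return count


if __name__ == "__main__":
    sample = (
        "TokenRange(start_token:1, end_token:2, "
        "endpoints:[10.53.186.53, 10.140.118.40], "
        "rpc_endpoints:[10.53.186.53, 10.140.118.40])\n"
    )
    print(ranges_owned_by(sample, "10.140.118.4"))
    print(ranges_owned_by(sample, "10.140.118.40"))
```

A count of zero for the new node's address would confirm what the empty grep
output suggests.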

It seems that no token ranges are assigned to this node. However, according
to the documentation on vnodes, rebalancing should happen automatically.
Is there still a way to force rebalancing in Cassandra 2.X when using vnodes?
Or is there something else I could look into?



On 11 June 2014 08:26, Philipp Potisk <philipp.potisk@omnecon.com> wrote:

> Hey Rob,
>
> thanks for pointing out the issue with simultaneous bootstraps. However, I
> am not sure if this applies in my case. As a matter of fact I did not start
> the nodes simultaneously - I waited about 10min until they were receiving
> streams from other nodes. So I guess the topology-changes were exchanged as
> expected. Only the joining of the 3 nodes was done simultaneously.
> The StreamException, which killed the process, also happened at a later
> point in time. Since then the nodes have not picked up the join process
> again. I am now thinking of decommissioning and starting all over again.
>
> Phil
>
>
> On 11 June 2014 03:13, Robert Coli <rcoli@eventbrite.com> wrote:
>
>> On Tue, Jun 10, 2014 at 2:21 PM, Philipp Potisk <philipp.potisk@geroba.at> wrote:
>>
>>> First I added one node, which joined after 120min successfully. During
>>> that time there was no additional load on the cluster. Afterwards I started
>>> the other 3 new nodes after each other in order to join the cluster
>>> simultaneously.
>>>
>>
>> Bootstrapping multiple nodes at once is now and has always been Not
>> Supported, but is such a common thing for new operators to try that there
>> is now a goal to prevent them from doing it [1].
>>
>> Cancel those simultaneous bootstraps and do them one at a time, and
>> they'll probably work.
>>
>> [1] https://issues.apache.org/jira/browse/CASSANDRA-7069
>>
>> =Rob
>>
>
>
>
> --
> DI Philipp Potisk
>
> Omnecon IT e.U.
>
> Klabundgasse 5-7/3/17
> 1190 Wien
>
> Tel.: +43 660 46 02 632
> E-Mail.: philipp.potisk@omnecon.com
>
> Firmenbuchnummer: FN 342255 t
> UID: ATU65503966
>



-- 
DI Philipp Potisk

Omnecon IT e.U.

Klabundgasse 5-7/3/17
1190 Wien

Tel.: +43 660 46 02 632
E-Mail.: philipp.potisk@omnecon.com

Firmenbuchnummer: FN 342255 t
UID: ATU65503966
