cassandra-user mailing list archives

From Paulo Motta <pauloricard...@gmail.com>
Subject Re: Streaming from 1 node only when adding a new DC
Date Wed, 15 Jun 2016 13:25:31 GMT
For rebuild, replace, and -Dcassandra.consistent.rangemovement=false in
general, we currently pick the closest replica (as indicated by the snitch)
that has the range, which will often map to the same node due to the
dynamic snitch, especially when N=RF. This is good for picking a node in the
same DC or rack for transferring, but we can probably improve this to
distribute streaming load more evenly among candidate source nodes in the
same rack/DC.
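
To illustrate the difference, here is a rough sketch in Python. It is a toy
model, not Cassandra's actual source-selection code: the function names
(pick_closest, pick_balanced), the node names, and the range labels are all
made up for the example, and it assumes the snitch returns the same
proximity ordering for every range, as in the 3-node, RF=3 case reported
below.

```python
from collections import Counter


def pick_closest(ranges, replicas_by_proximity):
    """Always take the first (closest) replica for each range."""
    return {r: replicas_by_proximity[r][0] for r in ranges}


def pick_balanced(ranges, replicas_by_proximity):
    """Take the candidate with the fewest ranges assigned so far.

    Ties are broken by proximity order, so closer replicas are still
    preferred when the load is equal.
    """
    load = Counter()
    plan = {}
    for r in ranges:
        candidates = replicas_by_proximity[r]
        source = min(candidates, key=lambda n: (load[n], candidates.index(n)))
        plan[r] = source
        load[source] += 1
    return plan


# Hypothetical dc1: 3 nodes with RF=3, so every node holds every range.
nodes = ["dc1-n1", "dc1-n2", "dc1-n3"]
ranges = [f"range-{i}" for i in range(6)]
replicas = {r: list(nodes) for r in ranges}

closest_load = Counter(pick_closest(ranges, replicas).values())
balanced_load = Counter(pick_balanced(ranges, replicas).values())

print("closest :", dict(closest_load))
print("balanced:", dict(balanced_load))
```

In this model the closest-replica strategy sends all six ranges to a single
node, while the balanced variant spreads them two per node.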

Would you mind opening a ticket for improving this?

2016-06-14 17:35 GMT-03:00 Fabien Rousseau <fabifabi95@gmail.com>:

> We've tested with C* version 2.1.14.
> Yes, vnodes with 256 tokens.
> Once all the nodes in dc2 are added, the schema is modified to have RF=3 in
> dc1 and RF=3 in dc2.
> Then on each node of dc2:
> nodetool rebuild dc1
> On 14 June 2016 at 10:39, "kurt Greaves" <kurt@instaclustr.com> wrote:
>
>> What version of Cassandra are you using? Also what command are you using
>> to run the rebuilds? Are you using vnodes?
>>
>> On 13 June 2016 at 09:01, Fabien Rousseau <fabifabi95@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> We've tested adding a new DC from an existing DC having 3 nodes and RF=3
>>> (i.e. all nodes have all data).
>>> During the rebuild process, only one node of the first DC streamed data
>>> to the 3 nodes of the second DC.
>>>
>>> Our goal is to minimise the time it takes to rebuild a DC and would like
>>> to be able to stream from all nodes.
>>>
>>> Starting C* with debug logs, it appears that all nodes, when computing
>>> their "streaming plan", return the same node for all ranges.
>>> This is probably because all nodes in DC2 have the same view of the ring.
>>>
>>> I understand that when bootstrapping a new node, it's preferable to
>>> stream from the node being replaced, but when rebuilding a new DC, it
>>> should probably select sources "randomly" (rather than always selecting the
>>> same source for a specific range).
>>> What do you think?
>>>
>>> Best Regards,
>>> Fabien
>>>
>>
>>
>>
>> --
>> Kurt Greaves
>> kurt@instaclustr.com
>> www.instaclustr.com
>>
>
