cassandra-user mailing list archives

From Vijay Patil <vijay2110.t...@gmail.com>
Subject Re: cross DC data sync starts without rebuilding nodes on new DC
Date Wed, 06 Apr 2016 04:08:59 GMT
Thanks Alain and Sean, your detailed explanation answers my question.

Yes, nodetool status reflects new DC and nodetool netstats says not "No
Streams".
All my writes are going to the old DC with LOCAL_QUORUM. So yes, this new data
might be getting replicated to the new DC (repair was not running anywhere).
I will proceed with rebuilding the nodes in the new DC.

Thanks,
Vijay

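For readers following along, here is a rough sketch of the two steps discussed in this thread (7.a, altering the keyspace, and 7.b, rebuilding each new node), written as a small Python helper that only builds the command strings and does not run anything. The keyspace name `my_ks`, the DC names `DC1` and `DC2`, and the replication factor of 3 are placeholders that do not come from the thread; substitute your own values.

```python
import shlex


def alter_keyspace_cql(keyspace, replication_by_dc):
    """Build the ALTER KEYSPACE statement for step 7.a.

    replication_by_dc maps a datacenter name to its replication factor.
    """
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {int(rf)}" for dc, rf in replication_by_dc.items()]
    return f"ALTER KEYSPACE {keyspace} WITH replication = {{{', '.join(parts)}}};"


def rebuild_command(source_dc):
    """Build the nodetool call for step 7.b, to be run on each node of the new DC."""
    return ["nodetool", "rebuild", "--", source_dc]


def netstats_command():
    """Build the nodetool call used to check whether any streams are active."""
    return ["nodetool", "netstats"]


if __name__ == "__main__":
    # Placeholder names: my_ks, DC1 (existing DC), DC2 (new DC).
    print(alter_keyspace_cql("my_ks", {"DC1": 3, "DC2": 3}))
    print(shlex.join(rebuild_command("DC1")))
    print(shlex.join(netstats_command()))
```

Running `nodetool netstats` on a new node while the rebuild is in progress is the check Alain suggests in the quoted message for telling streaming apart from ordinary write replication.
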
On 5 April 2016 at 18:56, Alain RODRIGUEZ <arodrime@gmail.com> wrote:

> Hi Vijay.
>
>> After completing step 7.a (which is altering keyspace with desired
>> replication factor for new DC) data automatically starts syncing from
>> existing DC to new DC
>>
>
> My guess: what you are seeing is not a sync of the existing data. It is new
> writes being replicated, not old data being streamed. As soon as you set the
> RF for the new DC, it starts accepting writes.
>
> Some background:
> Using a LOCAL_X consistency level does not mean that the data is not copied
> to the other DCs. It only means the coordinator won't wait for acks from nodes
> in other DCs; the write should still reach every DC set in the keyspace
> configuration. So as soon as you say you want X copies of the data in the new
> datacenter, new data starts to be replicated there.
>
> To check:
>
> Are you writing in your original DC?
> Is the output of 'nodetool netstats' saying 'No streams' as I expect?
>
> When rebuilding, run this command again and you should see streams.
>
>> Any idea why it's happening?
>> If this is the desired behaviour then what's the purpose of rebuilding
>> each node on new DC (step 7.b)?
>>
>
> So basically, the rebuild lets the new DC have the *old* / *existing* data
> streamed from another DC. We use rebuild instead of auto_bootstrap so that
> nodes do not try to stream data as soon as they are added to the new DC. We
> want to add *all* the nodes first, so that ranges are distributed evenly, and
> only then start streaming, so that each node streams just the correct amount
> of data from the DC of our choice.
>
> C*heers,
> -----------------------
> Alain Rodriguez - alain@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
> 2016-04-05 8:26 GMT+02:00 Vijay Patil <vijay2110.tech@gmail.com>:
>
>> Hi,
>>
>> I have configured a new DC as per the instructions at the link below.
>>
>> https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
>>
>> After completing step 7.a (which is altering keyspace with desired
>> replication factor for new DC) data automatically starts syncing from
>> existing DC to new DC, which is not expected because auto_bootstrap is
>> false on all nodes (existing as well as new DC).
>>
>> Any idea why it's happening?
>> If this is the desired behaviour then what's the purpose of rebuilding
>> each node on new DC (step 7.b)?
>>
>> Cassandra version is 2.0.17 on all nodes in both DCs and I am using
>> GossipingPropertyFileSnitch.
>>
>> Regards,
>> Vijay
>>
>
>

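Alain's background point about LOCAL_X consistency levels can be illustrated with a toy model. The sketch below is not Cassandra code; it only counts, for a LOCAL_QUORUM write, how many replicas per DC are sent the write and how many acknowledgements the coordinator waits for. The DC names and replication factors are made up, and failures, hints and repair are deliberately ignored.

```python
def local_quorum_write(replication_by_dc, coordinator_dc):
    """Toy model of a LOCAL_QUORUM write.

    Returns (replicas_written, acks_required):
      replicas_written: replicas per DC that the write is sent to
      acks_required:    acks the coordinator waits for, local DC only
    """
    if coordinator_dc not in replication_by_dc:
        raise ValueError(f"unknown datacenter: {coordinator_dc}")
    # The write is sent to every replica in every DC listed in the keyspace.
    replicas_written = {dc: rf for dc, rf in replication_by_dc.items() if rf > 0}
    # LOCAL_QUORUM waits only for a quorum of the coordinator's own DC.
    acks_required = replication_by_dc[coordinator_dc] // 2 + 1
    return replicas_written, acks_required


if __name__ == "__main__":
    # Hypothetical keyspace settings before and after step 7.a.
    before = {"DC1": 3}
    after = {"DC1": 3, "DC2": 3}
    print(local_quorum_write(before, "DC1"))
    print(local_quorum_write(after, "DC1"))
```

In this model the number of acks required stays the same after the new DC is added to the keyspace, while the set of replicas receiving each new write grows. That matches the behaviour described in the thread: new writes show up in the new DC right away, but data written earlier only arrives there through a rebuild.
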