incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Ec2Snitch to Ec2MultiRegionSnitch
Date Sun, 21 Apr 2013 18:34:51 GMT
> So I guess we have to switch to Ec2MultiRegionSnitch.
It depends on how you are connecting the regions. 
If the nodes can directly communicate with each other, say through a VPN, you may not need
to change it. 
If they are behind a NAT you will need to use it. 

When you change the snitch, test first and make sure the nodes do not change their DC or rack.
There are potential problems when changing from the PropertyFileSnitch, but they will not affect
you. 
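One way to check (a sketch; run it before and after the snitch change and diff the output):

```shell
# List each node with the Datacenter and Rack the snitch reports for it;
# the assignments should be identical before and after the switch.
nodetool ring
```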

> Our C* cluster : C*1.2.2, 6 EC2 m1.xLarge in eu-west already running, wanting to add
3 m1.xLarge on us-east

I recommend using the same number of nodes in both DC's. 

> 1/ Change the yaml conf on each of the 6 eu-west existing nodes
>     - Ec2Snitch to Ec2MultiRegionSnitch
>     - uncomment the broadcast_address and set the public ip of the node
>     - let the private ip as defined right now the listen_address
>     - switch seeds from private to public IP
Sounds about right; remember to test and make limited changes. 
You may also want to enable SSL (see the yaml) and/or use a VPN or VPC between the DC's: http://www.datastax.com/docs/1.0/cluster_architecture/replication
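Put together, the per-node changes might look like this in cassandra.yaml (the IPs here are illustrative placeholders, not from the thread):

```yaml
# cassandra.yaml on an existing eu-west node
endpoint_snitch: Ec2MultiRegionSnitch

# Public IP of this node, so nodes in other regions can reach it
broadcast_address: 54.0.0.10

# Keep the private IP for intra-region traffic
listen_address: 10.0.0.10

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # Seeds now use public IPs; include seeds from both regions
          - seeds: "54.0.0.10,54.0.0.11,54.1.0.10"
```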

> 4/
>     - Add 3 nodes one by one with auto_bootstrap set to true.
> 5/
>     - Repair nodes (one by one)
>     - Cleanup nodes (one by one)
Make sure the code is using LOCAL_QUORUM.
Add the nodes (I recommend 6) with auto_bootstrap: false added to the yaml.
Update the keyspace replication strategy to add RF 3 for the new DC. 
Use nodetool rebuild on the new nodes to rebuild them from the eu-west DC. 
You do not need to run cleanup; data is not moving in the original DC. The two DC's each have
copies. 
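The add-and-rebuild sequence can be sketched as follows (the node address is a placeholder; assumes the keyspace replication already includes RF 3 for us-east):

```shell
# On each new us-east node, in cassandra.yaml, before first start:
#   auto_bootstrap: false
#
# Then stream the data to each new node from the existing DC.
# With the EC2 snitches the DC name is the region, i.e. eu-west:
nodetool -h <new-node-public-ip> rebuild eu-west
```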

> a/ Do I have to move the tokens since I don't use vnodes yet ? How should I position
all these nodes ?
I prefer to use the offset method. Take the 6 tokens from your eu-west DC and add 100 to them
for the new DC. 
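As a sketch of the offset method, assuming the RandomPartitioner (token space 0 to 2^127) and the recommended 6 nodes per DC:

```python
# Offset method for initial tokens (RandomPartitioner assumed).
NODES = 6
RING = 2 ** 127  # RandomPartitioner token space

# Balanced tokens for the existing 6-node eu-west DC.
eu_west = [i * RING // NODES for i in range(NODES)]

# New us-east DC: the same tokens offset by 100 so they never collide.
us_east = [t + 100 for t in eu_west]

for eu, us in zip(eu_west, us_east):
    print(eu, us)
```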

> b/ Is it useful to add a seed from the new us-east data center in the yaml of all nodes
?
Yes. Have 3 from each. 

> c/ I am using the SimpleStrategy. Is it worth it/mandatory to change this strategy when
using multiple DC ?
Yes. You *MUST* change this, otherwise your code will have to wait on cross-DC latency and
you will not be able to use the LOCAL_ or EACH_ CL levels. 

You need to do this first. 

There is some information out there on doing this. A change like this can result in data going
missing, so do some testing. If all your nodes in eu-west are in the same AZ (the same Cassandra
rack) then you can make the change to NTS without impact. If not, it's going to be tricky.
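As a sketch, the strategy change is an ALTER KEYSPACE (CQL3; "myks" is a placeholder keyspace name, and with the EC2 snitches the DC name is the region, e.g. eu-west):

```sql
-- Before: SimpleStrategy, which is unaware of DCs and racks.
-- After: NetworkTopologyStrategy with explicit per-DC replication.
ALTER KEYSPACE myks WITH replication =
  {'class': 'NetworkTopologyStrategy', 'eu-west': 3};

-- Later, when adding the us-east DC:
ALTER KEYSPACE myks WITH replication =
  {'class': 'NetworkTopologyStrategy', 'eu-west': 3, 'us-east': 3};
```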


> d/ With my 2 DC will I have 3 RF per DC or cross DC ?
Use the NTS and have RF 3 in each DC http://www.datastax.com/docs/1.1/cluster_architecture/replication#replication-strategy

> e/ Should I configure my C* client to use the C* nodes from their region as coordinators
 (which seems to me the good way) or should I configure all the servers everywhere ?
Use the local nodes only. 


First thing is to update the replication strategy and get the code using LOCAL_QUORUM. 

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 19/04/2013, at 2:41 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:

> Hi,
> 
> The company I work for is having so much success that we are expanding worldwide :).
We have to deploy our Cassandra servers worldwide too in order to improve the latency of our
new abroad customers.
> 
> I am wondering about the process to grow from one data center to a few of them. First
thing is we use EC2Snitch for now. So I guess we have to switch to Ec2MultiRegionSnitch.
> 
> Is that doable without any down-time ? 
> 
> Our C* cluster : C*1.2.2, 6 EC2 m1.xLarge in eu-west already running, wanting to add
3 m1.xLarge on us-east
> 
> I was planning to do it this way:
> 
> 1/ Change the yaml conf on each of the 6 eu-west existing nodes
>     - Ec2Snitch to Ec2MultiRegionSnitch
>     - uncomment the broadcast_address and set the public ip of the node
>     - let the private ip as defined right now the listen_address
>     - switch seeds from private to public IP
> 2/ Rolling restart
>     - nodetool disablegossip
>     - nodetool disablethrift
>     - nodetool drain
>     - rm /path/cassandra/commitlog/* ? (I used to do it since drain was broken to avoid
replaying counters logs, leading to overcounts, not sure how pertinent this is nowadays)
>     - service cassandra stop
>     - service cassandra start
> 3/
>     - Make sure everything is still running smoothly in eu-west servers
> 4/
>     - Add 3 nodes one by one with auto_bootstrap set to true.
> 5/
>     - Repair nodes (one by one)
>     - Cleanup nodes (one by one)
> 
> 
> Questions :
> 
> a/ Do I have to move the tokens since I don't use vnodes yet ? How should I position
all these nodes ?
> b/ Is it useful to add a seed from the new us-east data center in the yaml of all nodes
?
> c/ I am using the SimpleStrategy. Is it worth it/mandatory to change this strategy when
using multiple DC ?
> d/ With my 2 DC will I have 3 RF per DC or cross DC ?
> e/ Should I configure my C* client to use the C* nodes from their region as coordinators
 (which seems to me the good way) or should I configure all the servers everywhere ?
> 
> Any comment on the process described above would be appreciated, specially if you are
arguing that something is wrong about it.
> 
> If you find out I am missing something, I will be glad to hear about it.
> 
> Alain
> 

