cassandra-user mailing list archives

From Jeff Jirsa <>
Subject Re: Cassandra Splitting databases
Date Fri, 04 Jan 2019 23:05:21 GMT
I encourage you to try all of these in a lab/non-prod environment before
you do this in production. And take backups. This is risky and you should
think about what you're doing before you do it.

The most practical way to do this with no downtime is to spin up a new
cluster in Azure and either do double writes or double reads until you

Double writes:

- Make new cluster B
- Double write to both database A and database B, reading only from database
A
- Once you begin double writing, you can use sstableloader to backfill the
data from database A into database B.
- Once you've finished the backfill, you switch your reads to database B
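
A minimal sketch of the double-write step, assuming the application reaches each cluster through a session object that exposes execute(statement, params), as the DataStax Python driver's Session does. The DualWriter class, its attribute names, and the choice to record (rather than fail on) a failed write to B are all hypothetical illustrations, not something prescribed in this thread.

```python
import logging

log = logging.getLogger("dual_write")


class DualWriter:
    """Hypothetical helper: write to both A and B, read only from A."""

    def __init__(self, session_a, session_b):
        self.session_a = session_a  # existing cluster, still the source of truth
        self.session_b = session_b  # new cluster being populated
        self.failed_b_writes = []   # writes that still need to be replayed on B

    def write(self, statement, params=()):
        # A failure on A propagates to the caller, as it did before the migration.
        self.session_a.execute(statement, params)
        try:
            self.session_b.execute(statement, params)
        except Exception:
            # Assumption: a failed write to B should not fail the request.
            # It is recorded so that it can be retried before reads move to B.
            log.exception("write to B failed")
            self.failed_b_writes.append((statement, params))

    def read(self, statement, params=()):
        # Reads stay on A until the backfill has finished.
        return self.session_a.execute(statement, params)
```

Switching reads to B is then a one-line change in read(), made only once the backfill is complete and failed_b_writes has been drained.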

Double reads:

- Make new cluster B
- Double read from both A and B, merging the results on the application side
- Move writes from A to B
- Once you've started double reading, use sstableloader to backfill the data
from database A to database B
- Once you've finished the backfill, turn off the double reads.
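
One way the application-side merge might look, as a rough sketch. It assumes rows arrive as plain dicts and that each row carries its write timestamp (for example, selected with CQL's WRITETIME()). The function name and the "id" and "write_ts" column names are hypothetical. It keeps the newest version of each key and does not handle deletes, which would need separate treatment.

```python
def merge_rows(rows_a, rows_b, key="id", ts="write_ts"):
    """Merge rows read from A and B, keeping the newest version of each key.

    `key` and `ts` are hypothetical column names chosen for this sketch.
    """
    merged = {}
    # A's rows are visited first, then B's.
    for row in list(rows_a) + list(rows_b):
        current = merged.get(row[key])
        # ">=" lets B win ties, because B's rows are visited after A's.
        if current is None or row[ts] >= current[ts]:
            merged[row[key]] = row
    return list(merged.values())
```

Letting B win ties is a design choice here: once writes have moved to B, it is the side more likely to hold the current value.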

*There's no "supported" obvious way to split a single cluster. Doing so has
some amount of hassle, but is probably possible to do safely.*

- You'll move each dataset into its own keyspace in the datacenter you want
it to live in
- You'll then firewall off the two distinct clusters from each other. At
this point, database A will see all hosts in A up and B down, and vice
versa (B will see A down, B up).
- While the firewall rules are in place, run "nodetool assassinate" in
database A for the instances of database B, and vice versa. This will make
the instances of A think B is removed, and B think A is removed.
- While the firewall rules are in place, delete all references to the old
hosts from system.peers and from all seed lists
- After 72 hours, you should be able to remove the firewall rules.
- If you skip any of these steps, the clusters will re-join after the
firewall rules are removed, because they have the same cluster name.
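
The assassinate and system.peers steps above can be written out as a per-side checklist. The sketch below only builds command strings and runs nothing. The split_plan function is hypothetical and the IP addresses are supplied by the caller. It assumes system.peers is keyed by the peer address, so check the schema on your Cassandra version, and try the result in a lab first.

```python
def split_plan(hosts_a, hosts_b):
    """Return {side: [commands]} for removing the other side's hosts.

    Only builds strings for review; nothing is executed.
    """
    plan = {}
    for side, others in (("A", hosts_b), ("B", hosts_a)):
        commands = []
        for ip in others:
            # Run once from a node on `side` while the firewall rules are in place.
            commands.append(f"nodetool assassinate {ip}")
            # system.peers is node-local, so this needs to be run on every
            # node of `side`.
            commands.append(f"DELETE FROM system.peers WHERE peer = '{ip}';")
        plan[side] = commands
    return plan
```

Seed lists live in each node's cassandra.yaml and still have to be edited by hand. This sketch does not cover them.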

On Fri, Jan 4, 2019 at 2:23 PM R1 J1 <> wrote:

> We currently have 2 databases (A and B) on a 6-node cluster.
> 3 nodes are on premise and 3 in Azure. I want database A to live on the
> on-premise cluster and I want database B to stay in Azure. I want to
> then split the cluster into 2 clusters, one on premise (3 nodes) having
> database A and the other in Azure (3 nodes) having database B.
> How do we accomplish such a split?
> Regards
> R1J1
