cassandra-user mailing list archives

From Jonathan Ellis
Subject Re: Expanding 0.6.x cluster to multiple datacenters
Date Thu, 28 Jul 2011 02:55:36 GMT
As you know, adding a datacenter with 0.6 is not as easy as it is with 0.7
and NetworkTopologyStrategy.  With 0.6 there is a right way that will work
with some manual effort, and a wrong way that can cause you major pain
and grief.

The right way:
- Switch to a DC-aware snitch but leave your cluster on RackUnawareStrategy
(RUS) to start with.
- Bootstrap the 2nd datacenter nodes (halfway) in between your 1st
datacenter tokens, so your ring alternates DC1 DC2 DC1 DC2 etc.  Do
this one at a time for minimum disruption.  You should have equal node
counts in each DC because RackAwareStrategy (RAS) will keep the amount of
data in each DC about equal.
- Switch the cluster to RAS
- Start repair.  You will need to run repair on each node.  In 0.6 you
should only run repair against one node at a time.
- While repair is going on, you need to read at CL.QUORUM or higher,
or data may appear to be missing, since it's not yet in all the places
the new strategy will look.  (But by alternating DC around the ring, 2
of the 3 replicas are guaranteed to be the same for both RUS and RAS.)
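
For the token placement step above, here is a minimal sketch of how the
"halfway in between" tokens could be computed. It assumes RandomPartitioner,
whose tokens are integers in [0, 2**127); the thread does not say which
partitioner is in use. The function name and the four-node example ring are
made up for illustration.

```python
# Assumption: RandomPartitioner, so tokens are integers in [0, 2**127).
RING_SIZE = 2 ** 127


def interleaved_tokens(dc1_tokens):
    """Return one new token halfway between each pair of adjacent DC1 tokens.

    Giving these tokens to the DC2 nodes makes the ring alternate
    DC1, DC2, DC1, DC2, ...
    """
    ring = sorted(dc1_tokens)
    new_tokens = []
    for i, tok in enumerate(ring):
        nxt = ring[(i + 1) % len(ring)]
        # Distance to the next DC1 token, wrapping past the end of the ring.
        gap = (nxt - tok) % RING_SIZE
        if gap == 0:
            # A single-node ring: that node owns the whole token space.
            gap = RING_SIZE
        new_tokens.append((tok + gap // 2) % RING_SIZE)
    return sorted(new_tokens)


# Hypothetical example: four DC1 nodes spaced evenly around the ring.
dc1 = [i * RING_SIZE // 4 for i in range(4)]
for token in interleaved_tokens(dc1):
    print(token)
```

Each printed value would be the initial token for one DC2 node, bootstrapped
one at a time as described above.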

The wrong way:
- Switch to RAS, then start adding nodes in the new DC.  As soon as
you add the first node in DC2, RAS will try to replicate ALL the rows
in DC1 to it.  Usually this overwhelms the DC2 node and it dies a
fiery death.
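
To see why the first DC2 node gets hit so hard, here is a toy model of
replica placement. It is a simplification, not Cassandra's code: RUS is
modeled as "the primary plus the next nodes on the ring", and RAS as "the
primary, then the first node found in another datacenter, then the next
unused nodes on the ring", with racks ignored. The node names, the ring
position of the DC2 node, and the helper names are made up; the 19 nodes
and the replication factor of 3 come from this thread.

```python
RF = 3  # replication factor; the post refers to "the 3 replicas"


def rus_replicas(ring, start, rf):
    """Toy RackUnawareStrategy: the primary plus the next rf-1 ring nodes."""
    n = len(ring)
    return [ring[(start + k) % n] for k in range(min(rf, n))]


def ras_replicas(ring, start, rf):
    """Toy RackAwareStrategy (racks ignored): the primary, then the first
    node in a different datacenter, then the next unused ring nodes."""
    n = len(ring)
    order = [ring[(start + k) % n] for k in range(n)]
    primary = order[0]
    chosen = [primary]
    for node in order[1:]:
        if node[1] != primary[1]:  # first node in another datacenter
            chosen.append(node)
            break
    for node in order[1:]:  # fill any remaining slots in ring order
        if len(chosen) >= rf:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen[:rf]


def ranges_held(ring, node, place, rf):
    """Count the ring ranges whose replica set includes `node`."""
    return sum(node in place(ring, s, rf) for s in range(len(ring)))


# 19 existing DC1 nodes plus the first, lone node added in DC2.
ring = [("dc1-node%d" % i, "DC1") for i in range(19)]
lone = ("dc2-node0", "DC2")
ring.insert(10, lone)

print(ranges_held(ring, lone, rus_replicas, RF))  # -> 3
print(ranges_held(ring, lone, ras_replicas, RF))  # -> 20
```

Under this model the lone DC2 node is a replica for 3 of the 20 ranges with
RUS, but for all 20 with RAS, which is the "ALL the rows" case above. The
ranges are not equal in size, so the counts are not exact data fractions.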

On Wed, Jul 27, 2011 at 7:44 PM, Ashley Martens wrote:
> I have a current 0.6.x cluster in a single datacenter with RackUnaware and
> am looking to expand into a second data center. I know I need to change to
> RackAwareStrategy however, I'm not sure what will happen to my data when I
> restart the nodes in the current cluster before I even add the new DC. Will
> the data need to move based on the rack each node is in or will it stay on
> the node it is currently on? Also, when I start adding nodes in the new DC
> to the cluster should they come in one at a time, like bootstrap, or should
> I light up several at the same time to distribute the data?
> For reference I have 19 nodes in my cluster.
> Thanks.

Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
