cassandra-user mailing list archives

From Jake Maizel <j...@soundcloud.com>
Subject Re: Best Practice for Data Center Migration
Date Fri, 03 Dec 2010 10:26:15 GMT
Thanks for the followup.

I have a few follow-up questions:

In the case of using decommission, any idea of what happens when we
get to the last node in the old data center?  Do you think it will
decommission properly?

I agree that this sounds like the easiest method.  We will have to check
whether the remaining nodes can support the storage requirement as we
work through the old data center and decommission each node.
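
A rough way to size that storage requirement is sketched below. It assumes
the old data center keeps exactly one copy of every row, spread evenly over
whichever old nodes remain, so it is a lower bound rather than a prediction.
The function name and the 600 GB figure are hypothetical and not taken from
our cluster.

```python
def per_node_load_gb(total_unique_gb, nodes_left):
    """Data each remaining old-DC node holds if the old DC keeps one
    copy of every row, spread evenly (an idealized assumption)."""
    if nodes_left < 1:
        raise ValueError("need at least one node left in the old DC")
    return total_unique_gb / nodes_left


TOTAL_UNIQUE_GB = 600  # hypothetical data volume, for illustration only

# Load per node as the old DC shrinks from 6 nodes down to 1.
loads = {n: per_node_load_gb(TOTAL_UNIQUE_GB, n) for n in range(6, 0, -1)}

for nodes_left, gb in loads.items():
    print(f"{nodes_left} old-DC nodes left: ~{gb:.0f} GB per node")
```

The last line of output is the case you described: with one node left in
the old data center, that node carries a full copy of the data.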

In the case of changing the RF and dropping the entire old data center,
here's what I was thinking:

We change the RF to 4, which I take to mean that there will be two
copies of the data in each data center.  So, if we just turn off all the
nodes in the old data center, we still have two copies of all data in
the new data center, and we can then rebuild and clean things up with
nodetool to get back to a normal state.  We would then turn the RF back
down to 3 and rebuild again to return to our original config.  The reason
I thought this would work is that RackAwareStrategy alternates replica
placement, and we have inserted the new data center nodes evenly in
between the old key ranges, so a pair of nodes in the new DC would each
get a replica of the data.  That would give us some redundancy until we
can rebuild.
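
To make that reasoning concrete, here is a toy model of the placement I
have in mind. It is not the actual RackAwareStrategy code: it assumes the
first replica goes to the node owning the range, the second to the next
node on the ring in the other data center, and any remaining replicas to
the following nodes in ring order. It ignores racks entirely, and the node
names are hypothetical.

```python
def toy_replicas(ring, primary_idx, rf):
    """Pick rf replicas for the range owned by ring[primary_idx].

    ring is a list of (node_name, dc) tuples in token order. This is a
    simplified model, not Cassandra's real placement logic.
    """
    n = len(ring)
    # Walk the ring clockwise, starting at the primary.
    walk = [ring[(primary_idx + i) % n] for i in range(n)]
    primary = walk[0]
    chosen = [primary]
    # Second replica: first node found in a different data center.
    if rf > 1:
        for node in walk[1:]:
            if node[1] != primary[1]:
                chosen.append(node)
                break
    # Remaining replicas: fill in ring order, skipping nodes already chosen.
    for node in walk[1:]:
        if len(chosen) >= rf:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


def min_copies_in_dc(ring, rf, dc):
    """Smallest number of replicas any range places in the given DC."""
    return min(
        sum(1 for _, node_dc in toy_replicas(ring, i, rf) if node_dc == dc)
        for i in range(len(ring))
    )


# 12-node ring with old and new DC nodes interleaved, as in our setup.
RING = []
for i in range(1, 7):
    RING.append((f"old{i}", "old"))
    RING.append((f"new{i}", "new"))

for rf in (3, 4):
    copies = min_copies_in_dc(RING, rf, "new")
    print(f"RF={rf}: at least {copies} intended copies in the new DC")
```

This only shows where the toy model says replicas should live. It says
nothing about whether the data has actually been streamed to those nodes
after an RF change, which may be exactly the part I am missing.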

I am probably making a bad assumption about the RackAwareStrategy that
blocks this.  If so, it'd be nice if you could explain it to me.

If you have another idea that might be worth discussing, I'd appreciate
hearing it.

Thanks,

Jake

On Thu, Dec 2, 2010 at 6:11 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> On Thu, Dec 2, 2010 at 4:08 AM, Jake Maizel <jake@soundcloud.com> wrote:
>> Hello,
>>
>> We have a ring of 12 nodes with 6 in one data center and 6 in another.
>>  We want to shut down all 6 nodes in data center 1 in order to close
>> it down.  We are using a replication factor of 3 and are using
>> RackAwareStrategy with version 0.6.6.
>>
>> We have been thinking that using decommission on each of the nodes in
>> the old data center one at a time would do the trick.  Does this sound
>> reasonable?
>
> That is the simplest approach.  The major downside is that
> RackAwareStrategy guarantees you will have at least one copy of _each_
> row in both DCs, so when you are down to 1 node in dc1 it will have a
> copy of all the data.  If you have a small enough data volume to make
> this feasible then that is the option I would go with.
>
>> We have also been considering increasing the replication factor to 4
>> and then just shutting down all the old nodes.  Would that work as far
>> as data availability would go?
>
> Not sure what you are thinking of there, but probably not. :)
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>



-- 
Jake Maizel
Network Operations
Soundcloud

Mail & GTalk: jake@soundcloud.com
Skype: jakecloud

Rosenthaler strasse 13, 101 19, Berlin, DE
