incubator-cassandra-user mailing list archives

From "Vegard Berget" <p...@fantasista.no>
Subject Moving data from one datacenter to another
Date Wed, 19 Dec 2012 12:27:45 GMT
Hi,

I know this has been a topic here before, but I need some input on
how to move data from one datacenter to another (Google just gives
me some old mails) while moving "production" writes the same way.
Adding the target cluster to the source cluster and just replicating
the data before moving the source nodes is not an option, so my plan
is as follows:

1) Flush data on the source cluster and move all data/ files to the
   destination cluster.  While this is going on, we are still writing
   to the source cluster.
2) When the data is copied, start Cassandra on the new cluster, then
   move writing/reading to the new cluster.
3) Now do a new flush on the source cluster.  As I understand it, the
   sstable files are immutable, so the _newly added_ data/ files
   could be moved to the target cluster.
4) After the new data is also copied into the target data/, do a
   nodetool refresh to load the new sstables into the system (I know
   we need to take care of filenames).
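
Steps 3 and 4 depend on knowing exactly which files appeared on the
source between the first copy and the second flush.  A minimal sketch
of that bookkeeping in Python, assuming the file listing is recorded
once before the first copy and once after the second flush.  The
function names are hypothetical and not part of Cassandra, and the
path in the usage comment is only an example:

```python
from pathlib import Path


def list_data_files(data_dir):
    """Return the relative paths of all regular files under data_dir."""
    root = Path(data_dir)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def newly_added(before, after):
    """Return files present in the second listing but not in the first.

    Files that disappeared in between (for example, removed by
    compaction) are ignored, since only additions need to be copied.
    """
    return sorted(set(after) - set(before))


# Hypothetical usage on a source node (example path, adjust as needed):
#   before = list_data_files("/var/lib/cassandra/data")  # before the first copy
#   ... first copy, switch traffic, second flush ...
#   after = list_data_files("/var/lib/cassandra/data")   # after the second flush
#   to_copy = newly_added(before, after)
```

The sketch compares names only, which relies on the assumption stated
in step 3 that an sstable file is never modified once written.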

	It's worth noting that none of the data is critical, but it would be
nice to get it right.  I know that there will be a short period
between steps 2 and 4 during which reads could potentially return old
data (data written while copying, read after we have moved
reads/writes).  This is OK in this case.  Our second alternative is to:

	1) Drain old cluster
	2) Copy to new cluster
	3) Start new cluster

	This will make the cluster unavailable for writes during the copy
period, and I wish to avoid that (even if that, too, is survivable).

	Both clusters are on 1.1.6, but we might upgrade the target to
1.1.7, as I can't see that this would cause any problems?

	Questions:

	1)  There is the same number of nodes in both clusters, but do the
tokens need to be the same as well?  (Wouldn't a repair correct that
later?)

	2)  Could data files have any name?  Could we, to avoid a filename
clash, just substitute the numbers with, for example, XXX in the
data files?

	3)  Is this really a sane way to do things?  
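
	On question 2, one possible way to avoid a filename clash is to
renumber the incoming sstables while copying them from a staging
directory into the target data/ directory.  A minimal sketch in
Python, assuming (this is not confirmed in this thread) that file
names end in -<generation>-<Component> with a numeric generation,
that the generation has to stay numeric, and that all component files
of one sstable must share the same generation.  The regular
expression and the function name are hypothetical:

```python
import re

# Assumed name layout: <prefix>-<generation>-<Component>, e.g. "ks-cf-hd-12-Data.db".
SSTABLE_RE = re.compile(r"^(?P<prefix>.+)-(?P<gen>\d+)-(?P<component>[^-]+)$")


def renumber(incoming, existing):
    """Map each incoming file name to a name with an unused generation.

    New generations start above the highest generation found in
    `existing`.  Components of the same incoming sstable (same prefix
    and old generation) receive the same new generation.  Names that
    do not match the assumed layout are left out of the mapping.
    """
    highest = 0
    for name in existing:
        m = SSTABLE_RE.match(name)
        if m:
            highest = max(highest, int(m.group("gen")))

    next_gen = highest + 1
    gen_map = {}   # (prefix, old generation) -> new generation
    mapping = {}   # old file name -> new file name
    for name in sorted(incoming):
        m = SSTABLE_RE.match(name)
        if not m:
            continue
        key = (m.group("prefix"), int(m.group("gen")))
        if key not in gen_map:
            gen_map[key] = next_gen
            next_gen += 1
        mapping[name] = "{}-{}-{}".format(
            m.group("prefix"), gen_map[key], m.group("component")
        )
    return mapping
```

The function only computes a mapping.  Applying it as a copy from a
separate staging directory, rather than renaming in place, avoids
overwriting a file whose old name equals another file's new name.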

	Suggestions are most welcome!

	Regards
Vegard Berget


