incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: copy data from multi-node cluster to single node
Date Tue, 05 Jul 2011 00:58:04 GMT
> How do you change the name of a cluster?  The FAQ instructions do not seem to work for me - are they still valid for 0.7.5?
> Is the backup / restore mechanism going to work, or is there a better/simpler to copy data from multi-node to single-node?

This was a bug, fixed in 0.7.6: https://github.com/apache/cassandra/blob/cassandra-0.7.6-2/CHANGES.txt#L21

You should also move to 0.7.6 to get the Gossip fix: https://github.com/apache/cassandra/blob/cassandra-0.7.6-2/CHANGES.txt#L6

When it comes to moving the data back to a single node I would:
- run repair
- snapshot the prod node
- clear all data, including the system KS data, from the dev node
- copy the snapshot data for only your KS to the dev node, into the correct directory, e.g. data/<my-keyspace>
- start the dev node
- add your KS; the node will now load the data

Ignoring the system data means the dev node can sort its cluster name and token out using
the yaml file.
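
As a rough sketch of the copy step only (not something from the original steps, just one way to script it), the Python below copies the files from one keyspace's snapshot into data/<my-keyspace> on the dev node and refuses to run if system KS data is still present. The function name restore_keyspace, the paths you pass in, and the assumption that the system keyspace lives in a directory called "system" under the data directory are all mine; check them against your own install.

```python
import shutil
from pathlib import Path


def restore_keyspace(snapshot_dir, dev_data_dir, keyspace):
    """Copy one keyspace's snapshot files into <dev_data_dir>/<keyspace>.

    Hypothetical helper: the arguments are whatever paths apply on your
    machines; nothing here talks to Cassandra itself.
    """
    # Assumption: system keyspace data sits in <dev_data_dir>/system.
    # If it is still there, the dev node would pick up the old cluster
    # name and token instead of taking them from the yaml file.
    system_dir = Path(dev_data_dir) / "system"
    if system_dir.is_dir() and any(system_dir.iterdir()):
        raise RuntimeError("clear the system KS data on the dev node first")

    target = Path(dev_data_dir) / keyspace
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for item in sorted(Path(snapshot_dir).iterdir()):
        if item.is_file():
            # copy2 keeps timestamps, which is handy when comparing files later
            shutil.copy2(item, target / item.name)
            copied.append(item.name)
    return copied
```

Run it with the dev node stopped, then start the node and add the KS as described above.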

Even with 3 nodes and RF 3 it's impossible to ever say that one node has a complete copy of
the data. Running repair will make it more likely, but the node could drop a mutation message
during the repair or drop off gossip for a few seconds. If you really want to have *everything*
from the prod cluster then copy the data from all 3 nodes onto the dev node and compact it
down.
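
One practical wrinkle if you copy from all 3 nodes, stated here as my assumption rather than something from the steps above: files from different nodes can have the same name, so they may need renumbering before they share one directory. The sketch below only builds a rename plan and touches no files. It assumes SSTable file names look like <ColumnFamily>-<version>-<generation>-<Component> (for example Users-f-12-Data.db); the name plan_merge and the node labels are hypothetical.

```python
import re
from collections import defaultdict

# Assumed file name shape: <ColumnFamily>-<version>-<generation>-<Component>,
# e.g. "Users-f-12-Data.db". Verify against the files on your own nodes.
SSTABLE_NAME = re.compile(
    r"^(?P<cf>.+)-(?P<version>[a-z]+)-(?P<gen>\d+)-(?P<component>.+)$"
)


def plan_merge(files_by_node):
    """Return {(node, old_name): new_name} with generations unique per CF.

    All components of one SSTable (Data, Index, ...) on one node keep a
    shared generation number, so they still belong together after renaming.
    """
    next_gen = defaultdict(int)  # column family -> last generation handed out
    assigned = {}                # (node, cf, old generation) -> new generation
    plan = {}
    for node in sorted(files_by_node):
        for name in sorted(files_by_node[node]):
            m = SSTABLE_NAME.match(name)
            if m is None:
                continue  # not shaped like an SSTable component; leave it out
            key = (node, m["cf"], m["gen"])
            if key not in assigned:
                next_gen[m["cf"]] += 1
                assigned[key] = next_gen[m["cf"]]
            plan[(node, name)] = (
                f'{m["cf"]}-{m["version"]}-{assigned[key]}-{m["component"]}'
            )
    return plan
```

Copy each file to the dev node under its new name from the plan, then start the node and compact.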

Hope that helps. 
  
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 5 Jul 2011, at 03:05, Ross Black wrote:

> Hi,
> 
> I am using Cassandra 0.7.5 on Linux machines.
> 
> I am trying to backup data from a multi-node cluster (3 nodes) and restore it into a single node cluster that has a different name (for development testing).
> 
> The multi-node cluster is backed up using clustertool global_snapshot, and then I copy the snapshot from a single node and replace the data directory in the single node.
> The multi-node cluster has a replication factor of 3, so I assume that restoring any node from the multi-node cluster will be the same.
> When started up this fails with a node name mismatch.
> 
> I have tried removing all the Location* files in the data directory (as per http://wiki.apache.org/cassandra/FAQ#clustername_mismatch) but the single node then fails with an error message:
> org.apache.cassandra.config.ConfigurationException: Found system table files, but they couldn't be loaded. Did you change the partitioner?
> 
> 
> How do you change the name of a cluster?  The FAQ instructions do not seem to work for me - are they still valid for 0.7.5?
> Is the backup / restore mechanism going to work, or is there a better/simpler to copy data from multi-node to single-node?
> 
> Thanks,
> Ross
> 

