cassandra-user mailing list archives

From Zhu Han <schumi....@gmail.com>
Subject Re: copy data from multi-node cluster to single node
Date Tue, 05 Jul 2011 02:59:09 GMT
On Tue, Jul 5, 2011 at 8:58 AM, aaron morton <aaron@thelastpickle.com> wrote:

> How do you change the name of a cluster?  The FAQ instructions do not seem
> to work for me - are they still valid for 0.7.5?
> Is the backup / restore mechanism going to work, or is there a
> better/simpler way to copy data from multi-node to single-node?
>
>
> Bug fixed on 0.7.6
> https://github.com/apache/cassandra/blob/cassandra-0.7.6-2/CHANGES.txt#L21
>
> Also you should move to 0.7.6 to get the Gossip fix
> https://github.com/apache/cassandra/blob/cassandra-0.7.6-2/CHANGES.txt#L6
>
> When it comes to moving the data back to a single node I would:
> - run repair
> - snapshot prod node
> - clear all data including the system KS data from the dev node
> - copy the snapshot data for only your KS to the dev node into the correct
> directory, e.g. data/<my-keyspace> .
> - start the dev node
> - add your KS, the node will now load the data
>
> Ignoring the system data means the dev node can sort its cluster name and
> token out using the yaml file.
>
> Even with 3 nodes and RF 3 it's impossible to ever say that one node has a
> complete copy of the data. Running repair will make it more likely, but the
> node could drop a mutation message during the repair or drop off gossip for
> a few seconds. If you really want to have *everything* from the prod cluster
> then copy the data from all 3 nodes onto the dev node and compact it down.
>

Is it possible the snapshots from different nodes have the same name?
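
To make the concern concrete: SSTable generation numbers are assigned per node, so files copied from different nodes can end up with identical names. Below is a rough Python sketch of one way to merge several snapshot directories into a single directory while renumbering generations, so nothing gets overwritten. The file name pattern (<ColumnFamily>-<version>-<generation>-<Component>.db) and the helper name merge_snapshots are assumptions for illustration only; check them against the files actually on disk before relying on this.

```python
import re
import shutil
from pathlib import Path

# ASSUMPTION: SSTable component files are named
#   <ColumnFamily>-<version>-<generation>-<Component>.db
# e.g. "Users-f-12-Data.db". Verify against your own data directory.
SSTABLE_RE = re.compile(
    r"^(?P<cf>.+)-(?P<ver>[a-z]+)-(?P<gen>\d+)-(?P<comp>[A-Za-z]+\.db)$"
)


def merge_snapshots(snapshot_dirs, target_dir):
    """Copy SSTable files from several snapshot directories into target_dir.

    Every SSTable (all components sharing a column family and generation)
    gets a fresh generation number, so files from different nodes never
    overwrite each other. Returns the list of file names written.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    next_gen = {}  # column family -> last generation handed out
    copied = []

    for snap in snapshot_dirs:
        # Group the component files of each SSTable in this snapshot.
        groups = {}
        for path in sorted(Path(snap).iterdir()):
            m = SSTABLE_RE.match(path.name)
            if m is None:
                continue  # not an SSTable component under the assumed pattern
            key = (m["cf"], m["ver"], int(m["gen"]))
            groups.setdefault(key, []).append((path, m["comp"]))

        for (cf, ver, _old_gen), files in sorted(groups.items()):
            gen = next_gen.get(cf, 0) + 1
            next_gen[cf] = gen
            for path, comp in files:
                dest = target / f"{cf}-{ver}-{gen}-{comp}"
                shutil.copy2(path, dest)
                copied.append(dest.name)

    return copied
```

It would be called with the three per-node snapshot directories and the dev node's keyspace directory as the target (all paths being whatever applies locally), with the dev node stopped. The point of renumbering per SSTable rather than per file is that the components of one SSTable have to keep sharing the same generation.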


>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5 Jul 2011, at 03:05, Ross Black wrote:
>
> Hi,
>
> I am using Cassandra 0.7.5 on Linux machines.
>
> I am trying to backup data from a multi-node cluster (3 nodes) and restore
> it into a single node cluster that has a different name (for development
> testing).
>
> The multi-node cluster is backed up using clustertool global_snapshot, and
> then I copy the snapshot from a single node and replace the data directory
> in the single node.
> The multi-node cluster has a replication factor of 3, so I assume that
> restoring any node from the multi-node cluster will be the same.
> When started up this fails with a node name mismatch.
>
> I have tried removing all the Location* files in the data directory (as per
> http://wiki.apache.org/cassandra/FAQ#clustername_mismatch) but the single
> node then fails with an error message:
> org.apache.cassandra.config.ConfigurationException: Found system table
> files, but they couldn't be loaded. Did you change the partitioner?
>
>
> How do you change the name of a cluster?  The FAQ instructions do not seem
> to work for me - are they still valid for 0.7.5?
> Is the backup / restore mechanism going to work, or is there a
> better/simpler way to copy data from multi-node to single-node?
>
> Thanks,
> Ross
>
>
>
