Is it possible the snapshots from different nodes have the same name?
The directory name will be made up of the current timestamp on the machine and the optional name passed via the command line. 

The SSTables from different nodes may have name collisions. If you are aggregating data from multiple nodes onto one, you will need to rename the colliding files manually so that no two SSTables share a name.
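As a rough illustration of that renaming, here is a minimal Python sketch that plans new, non-colliding names before anything is copied. It rests on an assumption: that SSTable files are named <ColumnFamily>-<version>-<generation>-<Component>, for example "Users-f-12-Data.db". Check that against your own data directory first. The function name plan_renames and the node labels are made up for the example and are not part of Cassandra.

```python
import re
from collections import defaultdict

# ASSUMPTION: file names look like <ColumnFamily>-<version>-<generation>-<Component>,
# e.g. "Users-f-12-Data.db". Verify this against your own data directory.
SSTABLE_RE = re.compile(
    r"^(?P<cf>.+)-(?P<ver>[a-z]+)-(?P<gen>\d+)-(?P<comp>[A-Za-z]+\.db)$"
)

def plan_renames(files_by_node):
    """Return {(node, old_name): new_name} with a unique generation per SSTable.

    files_by_node maps a node label (hypothetical, e.g. "node1") to the
    SSTable file names taken from that node's snapshot. All components of
    one SSTable (Data, Index, Filter, ...) get the same new generation.
    """
    next_gen = defaultdict(int)   # per column family generation counter
    assigned = {}                 # (node, cf, version, old generation) -> new generation
    plan = {}
    for node in sorted(files_by_node):
        for name in sorted(files_by_node[node]):
            m = SSTABLE_RE.match(name)
            if m is None:
                continue          # not an SSTable component, leave it alone
            key = (node, m["cf"], m["ver"], int(m["gen"]))
            if key not in assigned:
                next_gen[m["cf"]] += 1
                assigned[key] = next_gen[m["cf"]]
            plan[(node, name)] = f'{m["cf"]}-{m["ver"]}-{assigned[key]}-{m["comp"]}'
    return plan
```

The plan only computes names. The actual copy (for example with shutil.copy2 into the target keyspace directory) should be done while the target node is stopped.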

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton

On 5 Jul 2011, at 14:59, Zhu Han wrote:

On Tue, Jul 5, 2011 at 8:58 AM, aaron morton <aaron@thelastpickle.com> wrote:
How do you change the name of a cluster?  The FAQ instructions do not seem to work for me - are they still valid for 0.7.5?
Is the backup / restore mechanism going to work, or is there a better/simpler way to copy data from multi-node to single-node?


Also you should move to 0.7.6 to get the Gossip fix https://github.com/apache/cassandra/blob/cassandra-0.7.6-2/CHANGES.txt#L6

When it comes to moving the data back to a single node I would:
- run repair
- snapshot prod node
- clear all data including the system KS data from the dev node
- copy the snapshot data for only your KS into the correct directory on the dev node, e.g. data/<my-keyspace>
- start the dev node
- add your KS, the node will now load the data
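
The copy step in the list above could be scripted along these lines. This is only a sketch in Python: the function name restore_keyspace and its three parameters are hypothetical, and you have to supply the real snapshot directory and dev data directory yourself. It does not stop or start Cassandra, run repair, or clear old data; those steps remain manual.

```python
import shutil
from pathlib import Path

def restore_keyspace(snapshot_dir, dev_data_dir, keyspace):
    """Copy one keyspace's snapshot files into <dev_data_dir>/<keyspace>.

    Run this only while the dev node is stopped and after its old data
    (including the system keyspace) has been cleared. All three arguments
    are placeholders for your own paths and keyspace name.
    """
    target = Path(dev_data_dir) / keyspace
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(snapshot_dir).iterdir()):
        if src.is_file():                      # skip any nested directories
            shutil.copy2(src, target / src.name)
            copied.append(src.name)
    return copied
```

The system keyspace is deliberately not copied, so the dev node still takes its cluster name and token from the yaml file on first start.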

Ignoring the system data means the dev node can sort its cluster name and token out using the yaml file.

Even with 3 nodes and RF 3 it's impossible to ever say that one node has a complete copy of the data. Running repair will make it more likely, but the node could drop a mutation message during the repair or drop off gossip for a few seconds. If you really want to have *everything* from the prod cluster, then copy the data from all 3 nodes onto the dev node and compact it down.

Is it possible the snapshots from different nodes have the same name?
 

Hope that helps. 
  
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton

On 5 Jul 2011, at 03:05, Ross Black wrote:

Hi,

I am using Cassandra 0.7.5 on Linux machines.

I am trying to back up data from a multi-node cluster (3 nodes) and restore it into a single-node cluster that has a different name (for development testing).

The multi-node cluster is backed up using clustertool global_snapshot, and then I copy the snapshot from one of its nodes and use it to replace the data directory on the single-node cluster.
The multi-node cluster has a replication factor of 3, so I assume that restoring any node from the multi-node cluster will be the same.
When started up this fails with a node name mismatch.

I have tried removing all the Location* files in the data directory (as per http://wiki.apache.org/cassandra/FAQ#clustername_mismatch) but the single node then fails with an error message:
org.apache.cassandra.config.ConfigurationException: Found system table files, but they couldn't be loaded. Did you change the partitioner?


How do you change the name of a cluster?  The FAQ instructions do not seem to work for me - are they still valid for 0.7.5?
Is the backup / restore mechanism going to work, or is there a better/simpler way to copy data from multi-node to single-node?

Thanks,
Ross