Hi Shyamal,

I was using the same cluster name, but since writing that first email I've had success bringing up nodes in the analysis cluster under a different cluster name after deleting the LocationInfo* tables.

How have you been setting the tokens in the copied version of the cluster? Are you just mapping them one-to-one onto the tokens of the original cluster?

On Sun, Oct 2, 2011 at 3:49 PM, Shyamal Prasad <shyamal@member.fsf.org> wrote:
>>>>> "Eric" == Eric Czech <eric@nextbigsound.com> writes:

   Eric> We're exploring a data processing procedure where we snapshot
   Eric> our production cluster data and move that data to a new
   Eric> cluster for analysis but I'm having some strange issues where
   Eric> the analysis cluster is still somehow aware of the production
   Eric> cluster (i.e. the production cluster ring is trying to include
   Eric> nodes from the other cluster with the same token).

Are you using the same cluster name for both clusters? If so, I would
suggest you don't.

   Eric> The seed addresses in cassandra.yaml definitely prohibit this
   Eric> type of intersection between the two clusters so I'm guessing
   Eric> that it has something to do with the information in the system
   Eric> sstables.

I'm sure you will get a more knowledgeable answer from people who have
been doing this for a while, but I have to ask: are you copying over the
LocationInfo* SSTables from the snapshot to the analysis cluster?

The LocationInfo CF can record the endpoints in your production cluster.
From the little I've read of the code (StorageService.java and
SystemTable.java) it is possible (likely?) that endpoints from your
production cluster will get added to your analysis cluster's Gossiper on
startup. If you are using the same cluster name, well, there you have
it.....

   Eric> Is there any way to duplicate raw sstables in an effort to
   Eric> "copy" a cluster such that the copied cluster has a different
   Eric> name?  I know this usually results in a "saved cluster name X
   Eric> != Y" sort of error but it looks like we need to find some
   Eric> sort of way to do this logical separation.

Copying the raw tables and ignoring/deleting the
data/system/LocationInfo* files has worked for me. But I have to add the
disclaimer that I'm definitely a Cassandra newbie!
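
For what it's worth, here is a minimal sketch of that copy step in Python. It
assumes the snapshot has already been laid out as a data directory with one
subdirectory per keyspace (so the files in question sit under
data/system/LocationInfo*). The function name copy_data_dir and the paths in
the usage note are made up for illustration; this is not a Cassandra tool.

```python
import fnmatch
import shutil
from pathlib import Path


def copy_data_dir(src, dst, skip_pattern="LocationInfo*"):
    """Copy the tree at src to dst, leaving out files that match
    skip_pattern when they sit directly inside a directory named 'system'.

    Hypothetical helper: dst must not exist yet (shutil.copytree creates it).
    """

    def ignore(directory, names):
        # Only filter inside the system keyspace directory; every other
        # keyspace is copied untouched.
        if Path(directory).name != "system":
            return []
        return [n for n in names if fnmatch.fnmatch(n, skip_pattern)]

    shutil.copytree(src, dst, ignore=ignore)
```

Usage would be something like copy_data_dir("/snapshots/prod/data",
"/var/lib/cassandra/data") on each analysis node before starting it, with
both paths adjusted to your own layout.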

Cheers!
Shyamal