incubator-cassandra-user mailing list archives

From Yiming Sun <yiming....@gmail.com>
Subject mysterious data disappearance - what happened?
Date Fri, 09 Sep 2011 02:43:33 GMT
Hello,

If two different instances of cassandra are running on separate machines,
but both are unfortunately configured to use the default cluster name, "Test
Cluster", do they gang up as one cluster (even though they were intended to
be two separate stand-alone instances), so that dropping keyspace on one
machine would result in the disappearance of data from the other?

We ran into some mysterious data disappearance situations -- and we are not
sure if this is something with our configuration.

We installed 2 instances of cassandra 0.8.2 on 2 separate machines: one is
silk.cs.our.domain and the other is smoke.cs.our.domain.  Both the code and
data directories reside on the local partition of each machine, and we left
the yaml file pretty much at its defaults, except that we changed the data
directory paths and the listen addresses to point to each host.  What I
want to point out is that both files had the default cluster name, "Test
Cluster".
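
To make the configuration concern concrete, here is a minimal Python sketch
that checks whether two yaml files declare the same cluster name.  It is only
an illustration: the helper names (cluster_name, share_cluster_name) and the
sample yaml text are made up for this example, and the line-based parsing is
a simplification that assumes cluster_name appears as a plain top-level key.

```python
import re


def cluster_name(yaml_text):
    """Return the cluster_name value found in cassandra.yaml text, or None.

    Simplified line-based parsing (an assumption of this sketch): it expects
    a top-level `cluster_name: ...` line and strips surrounding quotes.
    """
    for line in yaml_text.splitlines():
        match = re.match(r"^cluster_name:\s*(.*?)\s*(?:#.*)?$", line)
        if match:
            return match.group(1).strip("'\"")
    return None


def share_cluster_name(yaml_a, yaml_b):
    """True when both configs define a cluster name and the names are equal."""
    name_a = cluster_name(yaml_a)
    name_b = cluster_name(yaml_b)
    return name_a is not None and name_a == name_b


# Hypothetical excerpts standing in for the two machines' yaml files.
silk_yaml = "cluster_name: 'Test Cluster'\nlisten_address: silk.cs.our.domain\n"
smoke_yaml = "cluster_name: 'Test Cluster'\nlisten_address: smoke.cs.our.domain\n"

if share_cluster_name(silk_yaml, smoke_yaml):
    print("Both instances use cluster name:", cluster_name(silk_yaml))
```

A matching name by itself only shows that the two configurations are not
distinguished by name; it does not show how the nodes found each other.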

Both cassandra instances were loaded with identical keyspaces and data sets;
the intention was that one would be used as the production server and the
other as the testing server.

Then yesterday we were trying to drop a keyspace on silk, which was the
testing server, and some time later, a user complained that he could not get
any data from the production machine.  We were intrigued but thought maybe
someone had made a mistake.

Then the same thing happened again today.  So we began to suspect that the
two servers had somehow discovered each other.  In fact, after we shut down
cassandra on silk and then loaded a keyspace definition into smoke using
cassandra-cli, it even reported an unreachable destination, and the IP was
that of silk.  I just want to confirm that the cause was our using the same
cluster name on both machines.  In other words, if we want to launch two
separate instances of cassandra and keep them separate, must we make sure
each uses a different cluster name, or else they will gang up into the same
cluster?  And how do they even discover each other?  Can someone enlighten
me please?  Thanks.


-- Y.
