incubator-cassandra-user mailing list archives

From Nimi Wariboko Jr <nimiwaribo...@gmail.com>
Subject Data Loss/Missing With Cassandra
Date Sun, 09 Jun 2013 00:56:42 GMT
Hi, 

We are seeing an issue where data that was written to the cluster is no longer accessible
after trying to expand the cluster. I will try to provide as much information as possible;
I am just starting out with Cassandra and I'm not entirely sure what is relevant.

All Cassandra nodes are 1.2.5, and each node has the same config. 

We started out by moving our entire data set to a single Cassandra node. This node was
initially set up with initial_token: 0 and otherwise default settings. After all the data
was moved over, we added 2 more nodes and upped the RF to 2. We also decided to start using
vnodes, which meant setting num_tokens to 256 and removing the initial_token param. We then
ran cassandra-shuffle as well.
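
For reference, the vnode change amounted to this in cassandra.yaml on each node (on the
original node it also meant deleting the initial_token: 0 line; everything else stayed at
defaults):

num_tokens: 256
# initial_token: 0    <- removed when switching to vnodes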

During the shuffle we started to notice that some rows were disappearing and then reappearing,
and others haven't come back at all. I stopped the shuffle, repaired each node, and restarted
the cluster; however, not all of the data has come back. Note that these reads are at
CONSISTENCY ALL.
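
Roughly the sequence of commands was the following (subcommand names from memory, so treat
this as a sketch rather than a transcript):

cassandra-shuffle create        # schedule the vnode range relocations
cassandra-shuffle enable        # start moving ranges
cassandra-shuffle disable       # stopped it once rows started vanishing
nodetool -h cass1 repair        # then repaired each node in turn
nodetool -h cass2 repair
nodetool -h cass3 repair        # ...and then restarted every node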

Here is my `nodetool status`. What is weird here is the token distribution of 260-239-1. I'm
not an expert, but I believe it should be 256-256-256, or at least add up to 768.

Datacenter: 129
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns   Host ID                               Rack
UN  10.129.196.4  371.56 GB  260     38.1%  cde6c3be-a066-47f2-abc2-b1d78bee0d7c  196
UN  10.129.196.5  212.64 GB  239     61.5%  2cb24510-2f89-46b2-96b9-873f8e8e50da  196
UN  10.129.196.6  256.05 GB  1       0.4%   ce8d4ea9-8106-44b3-a2dd-c0230eb53c94  196


(Full output: http://pastebin.com/37SwNaGq)

And here is the opscenter ring view (http://imgur.com/VssmFlw)
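
If it's useful for debugging, my understanding is that the tokens each node has gossiped can
also be cross-checked from the system tables (a sketch based on the 1.2 system schema; column
names may differ slightly):

cqlsh> SELECT peer, tokens FROM system.peers;   -- tokens each remote node claims
cqlsh> SELECT tokens FROM system.local;         -- the local node's own tokens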

What's also weird is that the token count from `nodetool -h [host] info` differs from status.

Example:
root@cass1:~# nodetool -h cass1 info | grep Token
Token            : (invoke with -T/--tokens to see all 239 tokens)
root@cass1:~# nodetool -h cass2 info | grep Token
Token            : (invoke with -T/--tokens to see all 269 tokens)
root@cass1:~# nodetool -h cass3 info | grep Token
Token            : (invoke with -T/--tokens to see all 260 tokens)


(Full output: http://pastebin.com/2hxpArt0)
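
Another way to count tokens per node that I can think of (a sketch; with vnodes, `nodetool
ring` prints one line per token, so grepping a node's address and counting lines should give
its token count):

root@cass1:~# nodetool ring | grep 10.129.196.4 | wc -l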

I believe it has something to do with the cluster not "seeing" all the tokens, but I am not
sure where to go from here. I don't believe any data was actually lost: there was no power
outage, and all the data should have been committed to disk before we added the two other
nodes.
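
By "committed to disk" I mean memtables flushed to SSTables, i.e. the effect of something
like this (a sketch, using the host names from above) before any topology change:

root@cass1:~# nodetool -h cass1 flush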

Thanks,
Nimi
nimiwaribokoj@gmail.com
