cassandra-user mailing list archives

From Jean Tremblay
Subject Catastrophy Recovery.
Date Mon, 15 Jun 2015 09:13:40 GMT


I have a cluster of 3 nodes with a replication factor (RF) of 2.
There are about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for the seeds and the throughput have been changed.

I have tested a scenario where one node crashes and loses all its data.
I have deleted all data on this node after having stopped Cassandra.
At this point I noticed that the cluster was still giving correct results, which is what I
was expecting from a clustered DB.

I then restarted that node and I observed that the node was joining the cluster.
After an hour or so the old “defect” node was up and normal.
I noticed that its hard disk held much less data than those of its neighbours.

When I was querying the DB, the cluster was giving me different results for successive identical
queries. I guess the old “defect” node was giving me fewer rows than it should have.
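
To make that guess concrete, here is a minimal toy sketch in Python. It is not Cassandra's real placement or read path; it only assumes that each key lives on 2 of the 3 nodes (RF 2), that one node has been wiped, and that each read is answered by a single replica (as it would be at consistency level ONE, which is my assumption here since the consistency level is not stated above). All names in it (replicas_for, build_cluster, read_all_at_one) are hypothetical.

```python
import random

NODES = 3  # cluster size from the description above
RF = 2     # replication factor from the description above


def replicas_for(key):
    """Toy placement (an assumption): primary node plus the next RF-1 nodes on the ring."""
    primary = key % NODES
    return [(primary + i) % NODES for i in range(RF)]


def build_cluster(keys):
    """Store every key on each of its replicas; one set of keys per node."""
    stores = [set() for _ in range(NODES)]
    for key in keys:
        for node in replicas_for(key):
            stores[node].add(key)
    return stores


def read_all_at_one(stores, keys, rng):
    """Read each key from one randomly chosen replica and count the rows found."""
    found = 0
    for key in keys:
        replica = rng.choice(replicas_for(key))
        if key in stores[replica]:
            found += 1
    return found


keys = range(3000)
stores = build_cluster(keys)
stores[0].clear()  # the "defect" node comes back without its data

rng = random.Random(42)  # fixed seed so the run is repeatable
counts = [read_all_at_one(stores, keys, rng) for _ in range(5)]
print(counts)  # row counts of five identical "queries"
```

In this model the same query returns a different number of rows from run to run, because every key that has a replica on the wiped node is missed whenever that node happens to answer.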

1) From what I understand, if you have a fixed node with no data, it will automatically bootstrap
and recover all its old data from its neighbours during the joining phase. Is this correct?
2) After such a catastrophe, and after the joining phase is done, should the cluster not be ready
to always deliver consistent data if there were no inserts or deletes during the catastrophe?
3) After the bootstrap of a broken node is finished, i.e. after the joining phase, is there
not simply a repair to be done on that node using “nodetool repair”?

Thanks for your comments.

Kind regards

