For background

If you do it for a single node then yes, there is a chance of inconsistency across CFs.

If you have multiple nodes, the snapshots you take on the later nodes will help. If you use CL QUORUM for reads you *may* be OK (I cannot work it out quickly). If you use CL ALL for reads you will be OK. Or you can use nodetool repair to ensure the data is consistent.

I doubt that even using repair would give you a provable guarantee, though. Anyone?
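
To make the failure mode concrete, here is a rough sketch of a sanity check you could run against a restored test cluster. It is only an illustration: plain Python dicts stand in for the two CFs, and the names (find_dangling, index_cf, data_cf) are made up for this example and are not part of any Cassandra API. In practice you would read the rows through whatever client you already use.

```python
def find_dangling(index_rows, data_rows):
    """Return (index_key, target_key) pairs whose target row is missing.

    index_rows: dict mapping an index row key to a list of data row keys.
    data_rows:  dict mapping a data row key to its columns.
    Both are hypothetical in-memory stand-ins for the restored CFs.
    """
    dangling = []
    for index_key, targets in index_rows.items():
        for target in targets:
            if target not in data_rows:
                dangling.append((index_key, target))
    return dangling


# Example: the index CF was flushed (and backed up) after "user2" was
# indexed, but the data CF was backed up before "user2" was flushed.
index_cf = {"city:Auckland": ["user1", "user2"], "city:Moscow": ["user3"]}
data_cf = {"user1": {"name": "a"}, "user3": {"name": "c"}}

for index_key, target in find_dangling(index_cf, data_cf):
    print("index row %s points at missing row %s" % (index_key, target))
```

A check like this only detects index entries that point at nothing; it says nothing about data rows that are missing their index entry, and it is not a substitute for repair.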


Aaron Morton
Freelance Cassandra Developer
New Zealand


On 6/12/2012, at 7:56 AM, Andrey Ilinykh <> wrote:

Hello, everybody!
I have a production cluster with incremental backup on and I want to clone it (create a test one). I don't understand one thing: each column family gets flushed (and copied to backup storage) independently, which means the total snapshot is inconsistent. If I restore from such a snapshot I get a totally useless system. To be more specific, let's say I have two CFs, one of which serves as an index for the other. Every time I update one CF I update the index CF. There is a good chance that all replicas flush the index CF first. Then I move it into backup storage, restore, and get a CF which has pointers to non-existent data in the other CF. What is a way to avoid this situation?

Thank you,