cassandra-user mailing list archives

From Russ Lavoie <ussray...@yahoo.com>
Subject Cassandra Snapshots giving me corrupted SSTables in the logs
Date Fri, 28 Mar 2014 18:15:33 GMT
We are using Cassandra 1.2.10 (with JNA installed) on Ubuntu 12.04.3, and we run our instances
in Amazon Web Services.

What I am trying to do.

Our Cassandra data lives on an EBS volume so that we can take snapshots of the data, create
volumes from those snapshots, and restore them wherever we want.
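
For context, the restore half of that EBS round trip looks roughly like the sketch below (use of the AWS CLI is an assumption on my part; the snapshot, volume, and instance IDs, the availability zone, and the device name are all placeholders):

    # Create a volume from an earlier EBS snapshot and attach it to the
    # instance where we want to restore (all IDs are placeholders).
    aws ec2 create-volume --snapshot-id snap-xxxxxxxx \
        --availability-zone us-east-1a
    aws ec2 attach-volume --volume-id vol-yyyyyyyy \
        --instance-id i-zzzzzzzz --device /dev/xvdf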

The snapshot process 

Step 1
Login to  the cassandra node.

Step 2
Run nodetool clearsnapshot

Step 3
Run nodetool snapshot

Step 4
Take EBS snapshot

Each of the above steps is performed only after the previous command has returned.
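
Scripted, steps 2 through 4 amount to roughly the following (a sketch; the snapshot tag, volume ID, and AWS CLI invocation are assumptions):

    #!/bin/bash
    set -e  # abort if any command fails

    # Steps 2 and 3: clear old snapshots, then take a fresh one.
    # nodetool blocks until each command returns.
    nodetool clearsnapshot
    nodetool snapshot -t backup

    # Step 4: snapshot the EBS data volume (placeholder volume ID).
    aws ec2 create-snapshot --volume-id vol-xxxxxxxx \
        --description "cassandra backup"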

Restore Process

Step 1
Remove data/system, the commit log, the saved caches, and data/<keyspace>/* (excluding the
snapshot directories)

Step 2
Move all snapshot files into their respective KS/CF locations

Step 3
Start Cassandra

Step 4 
Create the schema

Step 5
Look at the log. This is where I find a corrupted SSTable in our keyspace (not the system keyspace).
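
For concreteness, steps 1 and 2 of the restore come down to roughly this (a sketch assuming the default /var/lib/cassandra layout, a single keyspace, and the snapshot tag used above; all three are assumptions):

    #!/bin/bash
    set -e
    DATA=/var/lib/cassandra   # assumed default data directory
    KS=my_keyspace            # placeholder keyspace name
    TAG=backup                # placeholder snapshot tag

    # Step 1: remove system data, the commit log, the saved caches, and
    # the live SSTables, but keep each CF's snapshots directory intact.
    rm -rf "$DATA/data/system" "$DATA/commitlog" "$DATA/saved_caches"
    find "$DATA/data/$KS" -maxdepth 2 -type f -delete

    # Step 2: move the snapshot files back into their column family
    # directories (1.2 layout: data/<ks>/<cf>/snapshots/<tag>/).
    for snap in "$DATA/data/$KS"/*/snapshots/"$TAG"; do
        mv "$snap"/* "$(dirname "$(dirname "$snap")")"/
    done

Steps 3 through 5 (start Cassandra, recreate the schema, check the log) are done by hand.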

Troubleshooting

I suspected a race condition, so I did the following:

I inserted a sleep for 60 seconds after issuing “nodetool clearsnapshot” 
I inserted a sleep for 60 seconds after issuing “nodetool snapshot”

Then I took the snapshot and restored it following the same steps stated above.
It worked with no problems at all.
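
In other words, the only change that made it work was this (60 seconds being an arbitrary value I picked):

    nodetool clearsnapshot
    sleep 60          # arbitrary settle time
    nodetool snapshot -t backup
    sleep 60          # arbitrary settle time
    # ...then take the EBS snapshot as before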

So my assumption is that Cassandra is still doing work after “nodetool snapshot”
returns.

Now that you know what is going on, here is my question.

How can I tell when a snapshot is fully complete so I do not have corrupted SSTables?
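
One heuristic I have been considering (only a sketch, and it assumes that a quiescent snapshot directory means the snapshot is complete, which I cannot confirm) is to poll the snapshot directories until their contents stop changing before taking the EBS snapshot:

    # Poll until the snapshot files stop changing (name + size), then
    # assume the snapshot is complete. Paths are placeholders; requires
    # GNU find for -printf.
    SNAP_PARENT=/var/lib/cassandra/data/my_keyspace
    prev=""
    while true; do
        cur=$(find "$SNAP_PARENT" -path '*/snapshots/*' -type f \
                   -printf '%p %s\n' | sort)
        if [ -n "$cur" ] && [ "$cur" = "$prev" ]; then
            break
        fi
        prev="$cur"
        sleep 5
    done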

I can reproduce this 100% of the time.

Thanks for your help
