incubator-cassandra-user mailing list archives

From Ray Sutton <>
Subject Re: help on backup muiltinode cluster
Date Sat, 07 Dec 2013 13:57:50 GMT
I have not used tablesnap but it appears that it does not necessarily depend
upon taking a cassandra snapshot. The example given in their documentation
shows the source folder as /var/lib/cassandra/data/GiantKeyspace, which is
the root of the "GiantKeyspace" keyspace. But, snapshots operate at the
column-family level and are stored in a sub directory structure for each
column family. For example, if we have 2 column families in GiantKeyspace,
called cf1 and cf2, the snapshots would be located in
/var/lib/cassandra/data/GiantKeyspace/cf1/snapshots/snapshot_id/ and
/var/lib/cassandra/data/GiantKeyspace/cf2/snapshots/snapshot_id/, where
snapshot_id is some unique identifier for that snapshot. Unless tablesnap
will detect changes in subfolders, I don't know how you will tell tablesnap
the name of the actual snapshot folder before the snapshot is taken. I
think tablesnap's premise is that since a snapshot is simply a hard link
to an existing sstable file, and sstables are immutable, it can simply
operate on the original sstable, with no need for making a snapshot.
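
To make the directory layout concrete, here is a minimal Python sketch
that collects the per-column-family directories for one snapshot. The
function name snapshot_dirs is hypothetical, and the sketch assumes the
<keyspace>/<column_family>/snapshots/<snapshot_id>/ layout described
above; it is not part of tablesnap or cassandra.

```python
from pathlib import Path


def snapshot_dirs(keyspace_dir, snapshot_id):
    """Return the per-column-family directories holding one snapshot.

    Assumes the layout described above:
    <keyspace_dir>/<column_family>/snapshots/<snapshot_id>/
    """
    root = Path(keyspace_dir)
    # There is one snapshot directory per column family, so match
    # the same snapshot_id under every column family directory.
    matches = root.glob(f"*/snapshots/{snapshot_id}")
    return sorted(p for p in matches if p.is_dir())
```

For the example above, snapshot_dirs("/var/lib/cassandra/data/GiantKeyspace",
"snapshot_id") would return the cf1 and cf2 snapshot folders, which an
upload script could then walk once the snapshot has been taken.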

However, cassandra also performs compactions, which combine sstables into
new sstables for the purpose of "de-fragging" row data to optimize lookups.
The pre-compaction sstables are marked for deletion and removed during the
next GC. What this means to me is that you should use snapshots to preserve
the point-in-time state of the data. So there seems to be a small problem
to overcome if using snapshots and tablesnap together.
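
The reason a snapshot survives compaction cleanup can be shown with plain
files. The sketch below is only an illustration with stand-in file names
(cf1-Data.db and snap1 are made up): it hard-links a file into a
snapshots folder, deletes the original name, and reads the data back
through the link.

```python
import os
import tempfile


def hard_link_survives_delete():
    """Show that a hard link keeps data readable after the original name is removed."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "cf1-Data.db")  # stand-in for an sstable
        snap_dir = os.path.join(tmp, "snapshots", "snap1")
        os.makedirs(snap_dir)
        with open(original, "wb") as f:
            f.write(b"row data")

        linked = os.path.join(snap_dir, "cf1-Data.db")
        os.link(original, linked)  # a snapshot is a hard link, per the text above
        os.remove(original)        # stand-in for removal after compaction

        # The data is still reachable through the snapshot's link.
        with open(linked, "rb") as f:
            return f.read()
```

Backing up only the live sstable files gives no such guarantee: a file
seen at the start of an upload may be gone before the upload finishes.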

Ideally, to create a completely consistent point-in-time backup you would
stop client access to the cluster (nodetool disablethrift), execute a
flush to write memtables to disk, then execute the snapshot. In reality, if
you can execute the snapshot on all servers within a "short period of
time", for some value of 'short', your data will be relatively consistent.
If you ever needed to perform a restore from these snapshots, cassandra's
internal read repair feature would fix up any inconsistencies.
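
As a rough sketch of that ideal sequence for one node, the helper below
only builds the nodetool command lines in order; it does not run them.
The function name is hypothetical, the client-access step uses nodetool's
disablethrift subcommand, and the final enablethrift step is an
assumption added here so that clients can reconnect afterwards.

```python
def consistent_snapshot_commands(host, keyspace, tag):
    """Build the ordered nodetool commands for one node (nothing is executed)."""
    nodetool = ["nodetool", "-h", host]
    return [
        nodetool + ["disablethrift"],                 # stop Thrift client access
        nodetool + ["flush", keyspace],               # write memtables to disk
        nodetool + ["snapshot", "-t", tag, keyspace], # take the named snapshot
        nodetool + ["enablethrift"],                  # assumption: re-enable clients
    ]
```

Each returned list can be passed to subprocess.run as-is. To approach
cluster-wide consistency, every node would need client access disabled
before any node takes its snapshot.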

I use DataStax OpsCenter to take snapshots and then a homebrew python
script to upload to S3. OpsCenter sends the snapshot command to all servers
nearly simultaneously so the snapshots are executed almost in parallel.
This feature might only be available in the Enterprise version. You could
use a simple bash script to execute the nodetool snapshot command via ssh
to each server sequentially, or use a multi-window ssh client (csshX for
OS X) to execute in true parallel fashion.
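
For what it's worth, the ssh approach can also be run in near-parallel
from a single script. This is a minimal Python sketch, not the script
mentioned above: the function names are hypothetical, and it assumes
key-based ssh access to every node and nodetool on the remote PATH.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def snapshot_command(host, keyspace, tag):
    """Build the ssh command that takes a named snapshot on one node."""
    return ["ssh", host, "nodetool", "snapshot", "-t", tag, keyspace]


def snapshot_all(hosts, keyspace, tag, run=subprocess.run):
    """Run the snapshot command on all hosts at roughly the same time.

    `run` is injectable so the function can be exercised without ssh.
    Returns a dict mapping each host to whatever `run` returned.
    """
    hosts = list(hosts)
    if not hosts:
        return {}
    # One worker thread per host, so no node waits for another to finish.
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        futures = {
            host: pool.submit(run, snapshot_command(host, keyspace, tag))
            for host in hosts
        }
        return {host: future.result() for host, future in futures.items()}
```

Using the same tag on every node makes the snapshot folders easy to find
afterwards. With the default subprocess.run, check the returncode of each
result, since a failed ssh call does not raise on its own.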

Ray  //o-o\\

On Sat, Dec 7, 2013 at 4:09 AM, Jason Wee <> wrote:

> Hmm... cassandra fundamental key features like fault tolerant, durable and
> replication. Just out of curiosity, why would you want to do backup?
> /Jason
> On Sat, Dec 7, 2013 at 3:31 AM, Robert Coli <> wrote:
>> On Fri, Dec 6, 2013 at 6:41 AM, Amalrik Maia <> wrote:
>>> hey guys, I'm trying to take backups of a multi-node cassandra and save
>>> them on S3.
>>> My idea is simply doing ssh to each server and use nodetool to create
>>> the snapshots then push them to S3.
>>> So is this approach recommended? my concerns are about inconsistencies
>>> that this approach can lead, since the snapshots are taken one by one and
>>> not in parallel.
>>> Should i worry about it or cassandra finds a way to deal with
>>> inconsistencies when doing a restore?
>> The backup is as consistent as your cluster is at any given moment, which
>> is "not necessarily". Manual repair brings you closer to consistency, but
>> only on data present when the repair started.
>> =Rob
