On another note, I am writing a script that takes a backup of all the nodes in a single DC, renames the data files with a unique ID, and then merges them. The backup would be stored
on a storage medium in the same DC (a NAS, for example), which keeps the data copy from becoming a hefty job.
As long as RF >= 1 within that DC, the data from all the nodes in that one DC should add up to the complete data set.
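A minimal sketch of the rename-and-merge step, to make the idea concrete. It assumes the per-node snapshot files have already been copied into one directory per node under a common root; the function name `merge_snapshots` and the directory layout are hypothetical, and the "unique ID" here is simply the node's directory name. Files renamed this way are only collision-free for storage; restoring them would still need names that Cassandra accepts.

```shell
#!/bin/sh
# Hypothetical helper: merge per-node snapshot directories into one directory,
# prefixing every file with the node's directory name so that identically
# named files from different nodes cannot overwrite each other.
#
# Usage: merge_snapshots SRC_ROOT DEST
#   SRC_ROOT/<node>/<file>  ->  DEST/<node>-<file>
merge_snapshots() {
    src_root=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for node_dir in "$src_root"/*/; do
        # Skip the unexpanded pattern when SRC_ROOT has no subdirectories.
        [ -d "$node_dir" ] || continue
        node=$(basename "$node_dir")
        for f in "$node_dir"*; do
            # Only regular files are merged; nested directories are ignored.
            [ -f "$f" ] || continue
            cp -p "$f" "$dest/$node-$(basename "$f")" || return 1
        done
    done
}
```

For example, `merge_snapshots /mnt/nas/dc1/nodes /mnt/nas/dc1/merged` (both paths made up) would leave one flat directory holding every node's files under distinct names.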
The next improvement would be to do the same with incremental snapshots, so that once you have a baseline, everything after that is just collecting the incremental chunks and merging them into the original global snapshot.
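A rough sketch of the incremental merge, under the assumption that data files are never modified after they are written, so a file name already present in the baseline does not need to be copied again. The function name `add_increment` and both directory arguments are hypothetical. Note that this sketch only ever adds files; it does not remove files that compaction has since made obsolete.

```shell
#!/bin/sh
# Hypothetical helper: fold an incremental snapshot into an existing baseline
# by copying only the files whose names are not already in the baseline.
#
# Usage: add_increment INCR_DIR BASELINE_DIR
add_increment() {
    incr_dir=$1
    baseline=$2
    mkdir -p "$baseline" || return 1
    for f in "$incr_dir"/*; do
        [ -f "$f" ] || continue
        target="$baseline/$(basename "$f")"
        # Existing files are left untouched; only new names are copied.
        if [ ! -e "$target" ]; then
            cp -p "$f" "$target" || return 1
            echo "added $(basename "$f")"
        fi
    done
}
```

Each run prints one "added ..." line per file that was new, which gives a quick check on how large each increment was.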
I have to do the same in each individual DC.
Do you guys agree?
From: Tamar Fraenkel [firstname.lastname@example.org]
Sent: Tuesday, May 01, 2012 10:50 AM
Subject: Re: Taking a Cluster Wide Snapshot
Thanks for posting the script.
I see that the snapshot is always a full one, and if I understand correctly, it replaces the old snapshot on S3. Am I right?
Senior Software Engineer, TOK Media
On Thu, Apr 26, 2012 at 9:39 AM, Deno Vichas wrote:
here's how i'm doing it in AWS land, using the DataStax AMI, via a nightly cron job.
you'll need pssh and s3cmd.
On 4/25/2012 11:34 PM, Shubham Srivastava wrote:
What's the best way (or the only way) to take a cluster-wide backup of Cassandra? I can't find much documentation on it.
I am using a multi-DC setup with Cassandra 0.8.6.
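# Run from a control host; prod-cassandra-nodes.txt lists the Cassandra nodes.
echo "making snapshots"
# Clear the previous snapshot, then take a fresh one of the stocktouch keyspace.
pssh -h prod-cassandra-nodes.txt -l ubuntu -P 'nodetool -h localhost -p 7199 clearsnapshot stocktouch'
pssh -h prod-cassandra-nodes.txt -l ubuntu -P 'nodetool -h localhost -p 7199 snapshot stocktouch'
echo "making tar balls"
# -t 0 disables the pssh timeout; rm -f so a missing old tarball is not an error.
pssh -h prod-cassandra-nodes.txt -l ubuntu -P -t 0 'rm -f `hostname`-cassandra-snapshot.tar.gz'
pssh -h prod-cassandra-nodes.txt -l ubuntu -P -t 0 'tar -zcvf `hostname`-cassandra-snapshot.tar.gz /raid0/cassandra/data/stocktouch/snapshots'
echo "copying tar balls"
# pslurp stores each node's tarball in a local directory named after that host.
pslurp -h prod-cassandra-nodes.txt -l ubuntu /home/ubuntu/*cassandra-snapshot.tar.gz .
echo "tar'ing tar balls"
# 10* matches the per-host directories (this assumes the node addresses start with 10).
tar -cvf cassandra-snapshots-all-nodes.tar 10*
echo "pushing to S3"
../s3cmd-1.1.0-beta3/s3cmd put cassandra-snapshots-all-nodes.tar s3://stocktouch-backups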