cassandra-user mailing list archives

From "Jens Rantil" <jens.ran...@tink.se>
Subject Re: Cassandra backup via snapshots in production
Date Thu, 27 Nov 2014 10:34:06 GMT
Late answer, but you can find my backup script here: https://gist.github.com/JensRantil/a8150e998250edfcd1a3


Basically, you need to set S3_BUCKET and PGP_KEY_RECIPIENT, configure s3cmd (using `s3cmd --configure`),
and then issue `./backup-keyspace.sh your-keyspace` to back the keyspace up to S3. We run the script
periodically on every node.
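
For readers who want the general shape before opening the gist, the flow described in this thread (named snapshot, then encrypt, upload, and delete each snapshotted file) could be sketched roughly as follows. This is not the script from the gist. The `DATA_DIR` default, the tag format, and the `s3://bucket/host/...` key layout are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only, not the script from the gist.
# Assumed: the DATA_DIR default, the tag format, and the S3 key layout.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"

backup_keyspace() {
    local keyspace="$1"
    local tag="${2:-backup-$(date -u +%Y%m%dT%H%M%SZ)}"
    local host file rel

    if [ -z "${S3_BUCKET:-}" ] || [ -z "${PGP_KEY_RECIPIENT:-}" ]; then
        echo "S3_BUCKET and PGP_KEY_RECIPIENT must be set" >&2
        return 1
    fi
    host="$(uname -n)"

    # 1. Create a named snapshot of the keyspace.
    nodetool snapshot -t "$tag" "$keyspace" || return 1

    # 2. Encrypt, upload, and delete each file in the snapshot directories.
    #    Assumes the <keyspace>/<table>/snapshots/<tag>/ layout.
    for file in "$DATA_DIR/$keyspace"/*/snapshots/"$tag"/*; do
        [ -f "$file" ] || continue
        rel="${file#"$DATA_DIR"/}"
        gpg --batch --yes --encrypt --recipient "$PGP_KEY_RECIPIENT" \
            --output "$file.gpg" "$file" || return 1
        s3cmd put "$file.gpg" "s3://$S3_BUCKET/$host/$rel.gpg" || return 1
        rm -f -- "$file" "$file.gpg"
    done

    # 3. Remove whatever is left of the snapshot.
    nodetool clearsnapshot -t "$tag" "$keyspace"
}
```

Adding a call such as `backup_keyspace "$1"` at the end of the file would turn the sketch into a runnable script. It stops at the first failed encryption or upload and leaves the rest of the snapshot on disk, so nothing is deleted before it has been uploaded.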




Regarding “s3cmd --configure”, I executed it once and then copied “~/.s3cfg” to all
nodes.
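
Copying the file around can be scripted. A minimal sketch, assuming SSH access to each node and that the same user runs the backup everywhere; the host names in the usage line are hypothetical:

```shell
# distribute_s3cfg NODE...
# Copies the local ~/.s3cfg into the remote user's home directory on each node.
distribute_s3cfg() {
    local node rc=0
    for node in "$@"; do
        # -p preserves the file mode; a relative remote path resolves
        # against the remote home directory.
        scp -p "$HOME/.s3cfg" "$node:.s3cfg" || {
            echo "copy to $node failed" >&2
            rc=1
        }
    done
    return "$rc"
}
```

Usage would look like `distribute_s3cfg cass-01 cass-02 cass-03`. The file contains the S3 access keys, so it is worth keeping it readable only by the backup user.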




Like I said, there’s lots of love that can be put into a backup system. Note that the script
has the following limitations:

 * It does not checksum the files. However, the s3cmd website states that by default it compares
MD5 and file size on upload.

 * It does not purge old files from S3 (you could configure that using S3 “Object Lifecycles”).

 * It does not warn you if a backup fails. Check your logs periodically.

 * It does not do any advanced logging. Make sure to redirect the output to a file or pipe it to
syslog, for example with the `logger` utility.

 * It does not do continuous/point-in-time backup.
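
The logging and failure-warning gaps in the list above can be narrowed with a small wrapper around the backup command. The sketch below is an assumption about how one might do it, not part of the gist; `run_logged` and the log path in the usage line are hypothetical names.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Appends the command's stdout and stderr to LOGFILE. On failure it also
# writes a one-line message to the log and to stderr, and returns the
# command's exit status.
run_logged() {
    local logfile="$1" rc
    shift
    "$@" >> "$logfile" 2>&1
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "backup FAILED (exit $rc): $*" | tee -a "$logfile" >&2
    fi
    return "$rc"
}
```

Usage would be something like `run_logged /var/log/cassandra-backup.log ./backup-keyspace.sh your-keyspace`. When run from cron, anything written to stderr is normally mailed to the job owner if mail delivery is set up, which gives a basic failure notice. Replacing the `tee` line with a pipe to `logger -t cassandra-backup` would send the message to syslog instead.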




That said, it does its job for us for now.




Feel free to propose improvements!




Cheers,

Jens


———
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se


On Fri, Nov 21, 2014 at 7:36 PM, William Arbaugh <waa@cs.umd.edu> wrote:

> Jens,
> I'd be interested in seeing your script. We've been thinking of doing exactly that but
> uploading to Glacier instead.
> Thanks, Bill
>> On Nov 21, 2014, at 11:40 AM, Jens Rantil <jens.rantil@tink.se> wrote:
>> 
>> > The main purpose is to protect us from human errors (e.g. unexpected manipulations:
>> > delete, drop tables, …).
>> 
>> If that is the main purpose, having "auto_snapshot: true" in cassandra.yaml will
>> be enough to protect you.
>> 
>> Regarding backup, I have a small script that creates a named snapshot and then, for each
>> sstable, encrypts it, uploads it to S3, and deletes the snapshotted sstable. It took me an hour to
>> write and roll out to all our nodes. The whole process is currently logged, but eventually
>> I will also send an e-mail if a backup fails.
>> 
>> ———
>> Jens Rantil
>> Backend engineer
>> Tink AB
>> Email: jens.rantil@tink.se
>> Phone: +46 708 84 18 32
>> Web: www.tink.se
>> 
>> 
>> On Tue, Nov 18, 2014 at 3:52 PM, Ngoc Minh VO <ngocminh.vo@bnpparibas.com> wrote:
>> 
>> Hello all,
>> 
>> 
>> 
>> 
>>  
>> 
>> We are looking for a solution to back up data in our C* cluster (v2.0.x, 16 nodes,
>> 4 x 500GB SSD, RF = 6 over 2 datacenters).
>> 
>> 
>> 
>> The main purpose is to protect us from human errors (e.g. unexpected manipulations:
>> delete, drop tables, …).
>> 
>> 
>> 
>> 
>>  
>> 
>> We are thinking of:
>> 
>> 
>> 
>> - Backup: add a 2TB HDD on each node for C* daily/weekly snapshots.
>> 
>> - Restore: load the most recent snapshots or the latest “non-corrupted”
>> ones and replay missing data imports from other data sources.
>> 
>> 
>> 
>> 
>>  
>> 
>> We would like to know if anybody is using Cassandra’s backup feature in production
>> and could share their experience with us.
>> 
>> 
>> 
>> 
>>  
>> 
>> Your help would be greatly appreciated.
>> 
>> 
>> 
>> Best regards,
>> 
>> 
>> 
>> Minh
>> 
>> 
>> 
>> 