incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: best practices on EC2 question
Date Fri, 17 May 2013 18:13:07 GMT
> b) do people skip backups altogether except for huge outages and just let rebooted server
> instances come up empty to repopulate via C*?
This one. 
Bootstrapping a new node into the cluster has only a small impact on the existing nodes, and the
new node will have all the data it needs when it finishes the process.

Cheers
  
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/05/2013, at 3:17 AM, Janne Jalkanen <Janne.Jalkanen@ecyrd.com> wrote:

> On May 16, 2013, at 17:05 , Brian Tarbox <tarbox@cabotresearch.com> wrote:
> 
>> An alternative that we had explored for a while was to do a two stage backup:
>> 1) copy a C* snapshot from the ephemeral drive to an EBS drive
>> 2) do an EBS snapshot to S3.
>> 
>> The idea being that EBS is quite reliable, S3 is still the emergency backup, and copying
>> back from EBS to ephemeral is likely much faster than the 15 MB/sec we get from S3.
> 
> Yup, this is what we do.  We use rsync with --bwlimit=4000 to copy the snapshots from
> the eph drive to EBS; this is intentionally very low so that the backup process does not
> eat our I/O.  This is on m1.xlarge instances; YMMV so measure :).  EBS drives are then
> snapshotted with ec2-consistent-snapshot, and old snapshots are expired using
> ec2-expire-snapshots (I believe these scripts are from Alestic).
> 
> /Janne
> 
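A minimal sketch of the throttled copy step Janne describes above (ephemeral drive to EBS), for illustration only. The paths and function names are hypothetical and not from the thread; the only detail taken from it is rsync with --bwlimit=4000. rsync treats that value as units of 1024 bytes per second, so this is roughly 4 MB/sec. The EBS snapshot and expiry steps are left out because the thread gives no options for those scripts.

```python
import subprocess
from typing import List


def build_rsync_command(snapshot_dir: str, ebs_dir: str,
                        bwlimit_kib: int = 4000) -> List[str]:
    """Build an rsync argv that copies snapshot_dir into ebs_dir, throttled.

    Both directory arguments are hypothetical example paths.
    """
    return [
        "rsync",
        "-a",                            # archive mode: recurse, keep perms/times
        f"--bwlimit={bwlimit_kib}",      # cap the transfer rate (KiB per second)
        snapshot_dir.rstrip("/") + "/",  # trailing slash: copy the dir's contents
        ebs_dir,
    ]


def copy_snapshot(snapshot_dir: str, ebs_dir: str,
                  bwlimit_kib: int = 4000) -> None:
    """Run the throttled copy; raises CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(snapshot_dir, ebs_dir, bwlimit_kib),
                   check=True)


def estimated_hours(size_gib: float, bwlimit_kib: int = 4000) -> float:
    """Lower bound on copy time if rsync runs at the full bwlimit."""
    return size_gib * 1024 * 1024 / bwlimit_kib / 3600


if __name__ == "__main__":
    # Hypothetical mount points; adjust to the actual layout.
    print(build_rsync_command("/mnt/ephemeral/snap", "/mnt/ebs/backup"))
    print(f"100 GiB takes at least {estimated_hours(100):.1f} hours")
```

At this limit a 100 GiB snapshot needs a bit over seven hours, so it is worth checking that the copy finishes before the next backup cycle starts. As Janne says, the right limit depends on the instance type, so measure.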

