incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Backup strategies in a multi DC cluster
Date Sun, 24 Mar 2013 17:44:40 GMT
> There are advantages and disadvantages in both approaches. What are people doing in their production systems?
Generally a mix of snapshots plus rsync, or tablesnap (https://github.com/synack/tablesnap), to get the snapshot files off the node.
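
For what it's worth, the snapshot plus rsync approach could be sketched roughly as below. This is only a sketch under assumptions: it assumes snapshots land under <data_dir>/<keyspace>/<column_family>/snapshots/<tag>, and the helper names, keyspace, tag, and backup host are all made up for illustration. It only builds the command lines so you can inspect them before running anything.

```python
from pathlib import Path


def snapshot_command(keyspace: str, tag: str) -> list[str]:
    """Build the nodetool call that snapshots one keyspace under a named tag."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def find_snapshot_dirs(data_dir: str, keyspace: str, tag: str) -> list[Path]:
    """List the per-column-family snapshot directories for a tag.

    Assumed layout (check it against your own node):
    <data_dir>/<keyspace>/<column_family>/snapshots/<tag>
    """
    keyspace_dir = Path(data_dir) / keyspace
    return sorted(p for p in keyspace_dir.glob(f"*/snapshots/{tag}") if p.is_dir())


def rsync_command(snapshot_dir: Path, remote_base: str) -> list[str]:
    """Build the rsync call that copies one snapshot directory off the node.

    remote_base is a hypothetical destination such as
    "backuphost:/backups/node1/20130324"; it must already exist.
    """
    column_family = snapshot_dir.parent.parent.name
    # The trailing slash on the source copies the directory contents.
    return ["rsync", "-a", f"{snapshot_dir}/", f"{remote_base}/{column_family}/"]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    keyspace, tag = "my_keyspace", "backup_20130324"
    print(" ".join(snapshot_command(keyspace, tag)))
    for directory in find_snapshot_dirs("/var/lib/cassandra/data", keyspace, tag):
        print(" ".join(rsync_command(directory, "backuphost:/backups/node1/" + tag)))
```

You would run the snapshot command first, then the rsync commands, on every node; tablesnap takes a different route and uploads SSTables to S3 as they appear.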

Cheers


-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/03/2013, at 4:37 AM, Jabbar Azam <ajazam@gmail.com> wrote:

> Hello,
> 
> I've been experimenting with Cassandra for quite a while now.
> 
> It's time for me to look at backups, but I'm not sure what the best practice is. I want to be able to recover the data to a point in time before any user or software errors.
> 
> We will have two datacentres with 4 servers and RF=3.
> 
> Each datacentre will hold at most 1.6 TB of data (including replication and LZ4 compression, measured with test data). That is ten years of data, after which we will start purging. This amounts to about 400 MB of data generated per day.
> 
> I've read about users snapshotting individual nodes to S3 (Netflix), and I've read about creating virtual datacentres (http://www.datastax.com/dev/blog/multi-datacenter-replication) where each virtual datacentre contains a backup node.
> 
> There are advantages and disadvantages in both approaches. What are people doing in their production systems?
> 
> 
> 
> 
> -- 
> Thanks
> 
> Jabbar Azam

