incubator-cassandra-user mailing list archives

From Jabbar Azam <aja...@gmail.com>
Subject Re: Backup strategies in a multi DC cluster
Date Sun, 24 Mar 2013 19:19:27 GMT
Thanks Aaron. I have a hypothetical question.

Assume you have four nodes and a snapshot is taken. The following day, if a
node goes down and data is corrupted through user error, how do you use
the previous night's snapshots?

Would you replace the faulty node first and then restore last night's
snapshot? What happens if you don't have a replacement node? You won't be
able to restore last night's snapshot.

However, if a virtual datacentre consisting of a backup node is used, then
the backup node could be used regardless of the number of nodes in the
datacentre. Would there be any disadvantages to this approach? Sorry for all
the questions; I want to understand all the options.
On 24 Mar 2013 17:45, "aaron morton" <aaron@thelastpickle.com> wrote:

> There are advantages and disadvantages in both approaches. What are people
> doing in their production systems?
>
> Generally a mix of snapshots+rsync or https://github.com/synack/tablesnap to
> get things off node.
>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 23/03/2013, at 4:37 AM, Jabbar Azam <ajazam@gmail.com> wrote:
>
> Hello,
>
> I've been experimenting with cassandra for quite a while now.
>
> It's time for me to look at backups but I'm not sure what the best
> practice is. I want to be able to recover the data to a point in time
> before any user or software errors.
>
> We will have two datacentres with 4 servers and RF=3.
>
> Each datacentre will have at most 1.6 TB (including replication and LZ4
> compression, measured using test data) of data. That is ten years of data,
> after which we will start purging. This amounts to about 400 MB of data
> generated per day.
>
> I've read about users taking snapshots of individual nodes to S3 (Netflix)
> and I've read about creating virtual datacentres (
> http://www.datastax.com/dev/blog/multi-datacenter-replication) where each
> virtual datacentre contains a backup node.
>
> There are advantages and disadvantages in both approaches. What are people
> doing in their production systems?
>
>
>
>
> --
> Thanks
>
> Jabbar Azam
>
>
>
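
The snapshots+rsync approach mentioned in the quoted reply above could be sketched roughly as follows. This is only an illustration under stated assumptions, not anyone's production script: the tag `nightly`, the destination `backuphost:/backups/node1`, and the data directory `/var/lib/cassandra/data` are hypothetical placeholders, and the on-disk layout `<data_dir>/<keyspace>/<table>/snapshots/<tag>` is assumed.

```python
import subprocess
from pathlib import Path


def snapshot_command(tag, keyspace):
    """Build the nodetool call that snapshots one keyspace under a named tag."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def find_snapshot_dirs(data_dir, keyspace, tag):
    """Return the per-table snapshot directories for a tag.

    Assumes the layout <data_dir>/<keyspace>/<table>/snapshots/<tag>.
    """
    pattern = f"{keyspace}/*/snapshots/{tag}"
    return sorted(p for p in Path(data_dir).glob(pattern) if p.is_dir())


def rsync_command(snapshot_dir, data_dir, dest):
    """Build an rsync call that copies one snapshot directory off the node.

    The "/./" marker together with --relative makes rsync recreate the
    keyspace/table/snapshots/tag path underneath the destination.
    """
    rel = Path(snapshot_dir).relative_to(data_dir)
    return ["rsync", "-a", "--relative", f"{data_dir}/./{rel}", dest]


def backup(keyspace, tag, dest, data_dir="/var/lib/cassandra/data"):
    """Take a snapshot on this node, then push every table's snapshot to dest."""
    subprocess.run(snapshot_command(tag, keyspace), check=True)
    for snap in find_snapshot_dirs(data_dir, keyspace, tag):
        subprocess.run(rsync_command(snap, data_dir, dest), check=True)
```

Run nightly on each node, this would be called as something like `backup("my_keyspace", "nightly", "backuphost:/backups/node1")`, where all three arguments are placeholders. Old snapshots still have to be cleared on the node separately (for example with `nodetool clearsnapshot`), since they otherwise keep consuming disk space.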
