cassandra-user mailing list archives

From Alex Major <>
Subject Re: Cassandra on AWS suggestions for data safety
Date Thu, 24 Jul 2014 10:07:23 GMT
On Thu, Jul 24, 2014 at 12:12 AM, Hao Cheng <> wrote:

> Hello,
> Based on what I've read in the archives here and on the documentation on
> Datastax and the Cassandra Community, EBS volumes, even provisioned IOPS
> with EBS optimized instances, are not recommended due to inconsistent
> performance. This I can deal with, but I was hoping for some
> recommendations from the community as far as solutions for data safety.
> I have a few ideas in mind:
> 1. Instance store for the database, then cassandra snapshots (via
> nodetool), stored on an EBS provisioned IOPS volume attached to the
> instance. That volume would serve to keep the DB safe in case of instance
> downtime, and I would set up regular snapshotting on the EBS volume for
> data safety (pushed to S3 and eventually glacier)
> 2. Instance store used as a bcache write-through cache for attached EBS
> volumes. The attached volumes persist all writes and are again snapshotted
> regularly.
> 3. Using a backup system, either manually via rsync or through something
> like Priam, to directly push backups of the data on ephemeral storage to S3.
> From where I'm sitting, #2 seems the easiest to set up, but could
> potentially cause problems if the EBS volume backing writes sees a spike in
> latency, driving up write times even if read times would remain fairly
> consistent.
> Do any of you all have recommendations or suggestions for a system like
> this?
> Thanks in advance!
> --Bryan

We have a cluster running on EBS with Provisioned IOPS, and we get
good performance from it (comparable to instance store). The only reason we're
moving off it is that EBS has been the thing that most often
crashes on AWS. The AWS SSD instance types are where we're heading, and I'd
recommend them if you can. Also make sure to keep at least 3 replicas;
things tend to fail more regularly so it'll keep you from having immediate

Our setup is to snapshot the instance stores and sync to S3. Not sure why
you'd sync to EBS really. Priam, which you mentioned, makes keeping backups
(snapshots) and storing them on S3 really simple -
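
For anyone who wants to do the snapshot-then-sync step without Priam, here is
a minimal sketch of the idea, not our actual scripts. It assumes the usual
data layout of <data_root>/<keyspace>/<table>/snapshots/<tag>, that nodetool
and the AWS CLI are on the PATH, and that the bucket name, node id and tag are
whatever you choose (all hypothetical here).

```python
import subprocess
from pathlib import Path


def snapshot_dirs(data_root, tag):
    """Return every <keyspace>/<table>/snapshots/<tag> directory under data_root."""
    pattern = f"*/*/snapshots/{tag}"
    return sorted(p for p in Path(data_root).glob(pattern) if p.is_dir())


def sync_commands(data_root, tag, bucket, node_id):
    """Build one `aws s3 sync` command per snapshotted table (nothing is run here)."""
    commands = []
    for snap in snapshot_dirs(data_root, tag):
        # Path ends in .../<keyspace>/<table>/snapshots/<tag>
        keyspace, table = snap.parts[-4], snap.parts[-3]
        # Assumed key layout: one prefix per node, tag, keyspace and table.
        dest = f"s3://{bucket}/{node_id}/{tag}/{keyspace}/{table}/"
        commands.append(["aws", "s3", "sync", str(snap), dest])
    return commands


def backup(data_root, tag, bucket, node_id):
    """Take a snapshot, push it to S3, then clear it to free local disk."""
    subprocess.run(["nodetool", "snapshot", "-t", tag], check=True)
    for cmd in sync_commands(data_root, tag, bucket, node_id):
        subprocess.run(cmd, check=True)
    subprocess.run(["nodetool", "clearsnapshot", "-t", tag], check=True)
```

Something like backup("/var/lib/cassandra/data", "daily", "my-backups",
"node-1") could then be run from cron on each node; the path and names are
placeholders. Keeping the node id in the S3 prefix means each node's SSTables
stay separate, which matters when you restore.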
