MIME-Version: 1.0
In-Reply-To: 
 <CAL66bxhpSvkpCURPA_zymtU0nALVaVK5StNsU1FJmg0BZJkz9A@mail.gmail.com>
References: 
 <CAL66bxhpSvkpCURPA_zymtU0nALVaVK5StNsU1FJmg0BZJkz9A@mail.gmail.com>
Date: Thu, 24 Jul 2014 11:07:23 +0100
Message-ID: 
 <CAMt1n-DSUNWm34DLd0sGiZUU5wM8qtEa6VdYnEbZDKo3bAY5xg@mail.gmail.com>
Subject: Re: Cassandra on AWS suggestions for data safety
From: Alex Major <al3xdm@gmail.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Content-Type: multipart/alternative; boundary=001a1133da3086d79404feed9f9a

--001a1133da3086d79404feed9f9a
Content-Type: text/plain; charset=UTF-8

On Thu, Jul 24, 2014 at 12:12 AM, Hao Cheng <bryan@critica.io> wrote:

> Hello,
>
> Based on what I've read in the archives here and on the documentation on
> Datastax and the Cassandra Community, EBS volumes, even provisioned IOPS
> with EBS optimized instances, are not recommended due to inconsistent
> performance. This I can deal with, but I was hoping for some
> recommendations from the community as far as solutions for data safety.
>
> I have a few ideas in mind:
>
> 1. Instance store for the database, then cassandra snapshots (via
> nodetool), stored on an EBS provisioned IOPS volume attached to the
> instance. That volume would serve to keep the DB safe in case of instance
> downtime, and I would set up regular snapshotting on the EBS volume for
> data safety (pushed to S3 and eventually glacier)
>
> 2. Instance store used as a bcache write-through cache for attached EBS
> volumes. The attached volumes persist all writes and are again snapshotted
> regularly.
>
> 3. Using a backup system, either manually via rsync or through something
> like Priam, to directly push backups of the data on ephemeral storage to S3.
>
> From where I'm sitting, #2 seems the easiest to set up, but could
> potentially cause problems if the EBS volume backing writes sees a spike in
> latency, driving up write times even if read times would remain fairly
> consistent.
>
> Do any of you all have recommendations or suggestions for a system like
> this?
>
> Thanks in advance!
>
> --Bryan
>

We have a cluster running on EBS with Provisioned IOPS, and we get good
performance from it (comparable to instance store). The only reason we're
moving off EBS is that it has been the part of AWS that fails most often
for us. We're heading to the AWS SSD instance types, and I'd recommend them
if you can use them. Also make sure to keep at least 3 replicas: things
tend to fail more regularly, so the extra replicas will keep you from
having immediate problems.
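
To make the replica advice concrete, here is a small arithmetic sketch. It
assumes reads and writes use QUORUM consistency, which the thread does not
state; the function names are illustrative only. A quorum is
floor(RF / 2) + 1 replicas, so 3 is the smallest replication factor that
can lose a replica and still serve quorum requests.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM requests still succeed."""
    return replication_factor - quorum(replication_factor)


# Compare small replication factors (QUORUM consistency is an assumption).
for rf in (1, 2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, can lose {tolerated_failures(rf)} replica(s)")
```

With a replication factor of 1 or 2, a single failed replica already blocks
quorum operations for the affected data; with 3, one replica can be down.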

We snapshot the instance stores and sync the snapshots to S3. I'm not sure
why you'd sync to EBS at all. Priam, which you mentioned, makes taking
backups (snapshots) and storing them on S3 really simple:
https://github.com/Netflix/Priam/wiki/Backups
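
For anyone not using Priam, the snapshot-then-sync approach can be sketched
roughly as below. This is not the setup described above, only a minimal
illustration under stated assumptions: the data directory is
/var/lib/cassandra/data, snapshots live under
<keyspace>/<table>/snapshots/<tag>, the AWS CLI is installed and
configured, and the bucket name and node label are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List

# Assumptions (not from the thread): package-default data directory and a
# hypothetical bucket name. Adjust both for a real cluster.
DATA_ROOT = Path("/var/lib/cassandra/data")
BUCKET = "my-cassandra-backups"


def find_snapshot_dirs(data_root: Path, tag: str) -> List[Path]:
    """Return every <keyspace>/<table>/snapshots/<tag> directory under data_root."""
    return sorted(p for p in data_root.glob(f"*/*/snapshots/{tag}") if p.is_dir())


def sync_command(snapshot_dir: Path, data_root: Path, bucket: str, node: str) -> List[str]:
    """Build the `aws s3 sync` command that uploads one snapshot directory."""
    relative = snapshot_dir.relative_to(data_root).as_posix()
    return ["aws", "s3", "sync", str(snapshot_dir), f"s3://{bucket}/{node}/{relative}"]


def backup_node(tag: str, node: str, data_root: Path = DATA_ROOT, bucket: str = BUCKET) -> None:
    """Snapshot the local node with nodetool, then upload each snapshot directory to S3."""
    subprocess.run(["nodetool", "snapshot", "-t", tag], check=True)
    for snapshot_dir in find_snapshot_dirs(data_root, tag):
        subprocess.run(sync_command(snapshot_dir, data_root, bucket, node), check=True)
```

Calling backup_node("daily-2014-07-24", "node1") on each node (for example
from cron) would take a tagged snapshot and upload it. Snapshots are hard
links to the SSTables, so old ones keep using disk space until they are
cleared with nodetool clearsnapshot.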


--001a1133da3086d79404feed9f9a--