cassandra-user mailing list archives

From Ben Slater <>
Subject Re: AWS ephemeral instances + backup
Date Thu, 05 Dec 2019 21:55:41 GMT
We have some tooling that does that kind of thing using S3 rather than
attached EBS, but on a similar principle. There is a bit of an overview here:

It's become a pretty core part of our ops toolbox since we introduced it.
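Ben's tooling isn't shown in this thread, but the bookkeeping at the core of an S3-based sstable sync can be sketched as follows. Because sstables are immutable once flushed, a backup pass only needs to upload files that aren't already in the bucket, and files that have disappeared locally were compacted away and can be expired from the backup. All names below are hypothetical illustration, not Instaclustr's actual tool:

```python
# Sketch of the planning step behind an incremental sstable backup to S3.
# (Hypothetical sketch -- not Instaclustr's actual tooling.)
def plan_sync(local_files: set, remote_files: set):
    """Return (to_upload, to_expire) given local and remote file listings."""
    to_upload = local_files - remote_files   # new sstables since last pass
    to_expire = remote_files - local_files   # compacted-away sstables
    return to_upload, to_expire

# Example: sstable 3 was compacted into sstable 4 since the last backup.
local = {"md-1-big-Data.db", "md-2-big-Data.db", "md-4-big-Data.db"}
remote = {"md-1-big-Data.db", "md-2-big-Data.db", "md-3-big-Data.db"}
upload, expire = plan_sync(local, remote)
```

The immutability of sstables is what makes this cheap: an actual uploader would only ever PUT new files, never rewrite existing ones.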



*Ben Slater*
*Chief Product Officer*



Read our latest technical blog posts here

This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
and Instaclustr Inc (USA).

This email and any attachments may contain confidential and legally
privileged information.  If you are not the intended recipient, do not copy
or disclose its content, but please reply to this email immediately and
highlight the error to the sender and then immediately delete the message.

On Fri, 6 Dec 2019 at 08:32, Jeff Jirsa <> wrote:

> No experience doing it that way personally, but I'm curious: Are you
> backing up in case of ephemeral instance dying, or backing up in case of
> data problems / errors / etc?
> If it's the instance dying, you're probably fine with just straight normal
> replacements, not restoring from backup. For the rest, is it cheaper to use
> something like tablesnap and go straight to S3?
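Jeff's "is it cheaper" question can be sanity-checked with a rough per-GB storage comparison. The rates below are placeholder ballpark figures, not authoritative AWS pricing, so treat this purely as the shape of the calculation:

```python
# Rough monthly storage-cost comparison for a 1 TB node.
# Rates are PLACEHOLDER examples, not current AWS pricing --
# check the S3 and EBS pricing pages for real figures.
NODE_GB = 1000
S3_STANDARD_PER_GB = 0.023   # $/GB-month (assumed example rate)
EBS_ST1_PER_GB = 0.045       # $/GB-month (assumed example rate, HDD-backed EBS)

s3_cost = NODE_GB * S3_STANDARD_PER_GB
ebs_cost = NODE_GB * EBS_ST1_PER_GB
print(f"S3:  ${s3_cost:.2f}/month per node")
print(f"EBS: ${ebs_cost:.2f}/month per node")
```

At rates like these the S3 approach roughly halves the storage bill, before even counting that an EBS volume must be provisioned at peak size while S3 bills only for bytes stored.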
> On Thu, Dec 5, 2019 at 12:21 PM Carl Mueller
> <> wrote:
>> Does anyone have experience with tooling written to support this strategy?
>> Use case: run Cassandra on i3 instances on ephemerals, but synchronize the
>> sstables and commitlog files to the cheapest EBS volume type (those have
>> bad IOPS but decent enough throughput).
>> On node replacement, the node's startup script back-copies the sstables
>> and commitlog state from the EBS volume to the ephemeral storage.
>> As can be seen:
>> the spinning rust tops out at 2375 MB/sec (presumably by striping multiple
>> EBS volumes), which would incur roughly a ten-minute delay for node
>> replacement on a 1TB node. But I imagine this would only be used on
>> higher-IOPS r/w nodes with smaller densities, so 100GB would be only about
>> a minute of delay, already within the timeframe of an AWS node
>> replacement/instance restart.
