cassandra-user mailing list archives

From James Rothering <jrother...@codojo.me>
Subject Re: EC2 storage options for C*
Date Thu, 04 Feb 2016 00:09:53 GMT
Just curious here ... when did EBS become OK for C*? Didn't they always
push towards using ephemeral disks?

On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead <ben@instaclustr.com> wrote:

> For what it's worth we've tried d2 instances and they encourage terrible
> things like super dense nodes (increases your replacement time). In terms
> of useable storage I would go with gp2 EBS on a m4 based instance.
>
> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <jack.krupansky@gmail.com>
> wrote:
>
>> Ah, yes, the good old days of m1.large.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com>
>> wrote:
>>
>>> A lot of people use the old gen instances (m1 in particular) because
>>> they came with a ton of effectively free ephemeral storage (up to 1.6TB).
>>> Whether or not they’re viable is a decision for each user to make. They’re
>>> very, very commonly used for C*, though. At a time when EBS was not
>>> sufficiently robust or reliable, a cluster of m1 instances was the de facto
>>> standard.
>>>
>>> The canonical “best practice” in 2015 was i2. We believe we’ve made a
>>> compelling argument to use m4 or c4 instead of i2. There exists a company
>>> we know currently testing d2 at scale, though I’m not sure they have much
>>> in terms of concrete results at this time.
>>>
>>> - Jeff
>>>
>>> From: Jack Krupansky
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Monday, February 1, 2016 at 1:55 PM
>>>
>>> To: "user@cassandra.apache.org"
>>> Subject: Re: EC2 storage options for C*
>>>
>>> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
>>> Dense Storage".
>>>
>>> The remaining question is whether any of the "Previous Generation
>>> Instances" should be publicly recommended going forward.
>>>
>>> And whether non-SSD instances should be recommended going forward as
>>> well. Sure, technically, someone could use the legacy instances, but the
>>> question is what we should be recommending as best practice going forward.
>>>
>>> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <srobenalt@highwire.org>
>>> wrote:
>>>
>>>> Hi Jack,
>>>>
>>>> At the bottom of the instance-types page, there is a link to the
>>>> previous generations, which includes the older series (m1, m2, etc), many
>>>> of which have HDD options.
>>>>
>>>> There are also the d2 (Dense Storage) instances in the current
>>>> generation that include various combos of local HDDs.
>>>>
>>>> The i2 series has good sized SSDs available, and has the enhanced
>>>> networking option, which is also useful for Cassandra. The enhanced
>>>> networking is available with other instance types as well, as you'll see
>>>> on the feature list under each type.
>>>>
>>>> Steve
>>>>
>>>>
>>>>
>>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <
>>>> jack.krupansky@gmail.com> wrote:
>>>>
>>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>>>> instances have local magnetic storage - all the other instance types are
>>>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent
>>>>> Data Access."
>>>>>
>>>>> For the record, that AWS doc has Cassandra listed as a use case for i2
>>>>> instance types.
>>>>>
>>>>> Also, the AWS doc lists EBS io1 for the NoSQL database use case and
>>>>> gp2 only for the "small to medium databases" use case.
>>>>>
>>>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)?
>>>>> Is the doc simply for any newly started instances?
>>>>>
>>>>> See:
>>>>> https://aws.amazon.com/ec2/instance-types/
>>>>> http://aws.amazon.com/ebs/details/
>>>>>
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com
>>>>> > wrote:
>>>>>
>>>>>> > My apologies if my questions are actually answered on the video or
>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>
>>>>>> Virtually all of them are covered.
>>>>>>
>>>>>> > I'm curious where the EBS physical devices actually reside - are
>>>>>> they in the same rack, the same data center, same availability zone? I
>>>>>> mean, people try to minimize network latency between nodes, so how
>>>>>> exactly is EBS able to avoid network latency?
>>>>>>
>>>>>> Not published, and probably not a straightforward answer (probably
>>>>>> have redundancy cross-az, if it matches some of their other published
>>>>>> behaviors). The promise they give you is ‘iops’, with a certain block
>>>>>> size. Some instance types are optimized for dedicated, ebs-only network
>>>>>> interfaces. Like most things in cassandra / cloud, the only way to know
>>>>>> for sure is to test it yourself and see if observed latency is
>>>>>> acceptable (or trust our testing, if you assume we’re sufficiently smart
>>>>>> and honest).
>>>>>>
>>>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>>>
>>>>>> We tested dozens of instance type/size combinations (literally). The
>>>>>> best performance was clearly with ebs-optimized instances that also have
>>>>>> enhanced networking (c4, m4, etc) - slide 43
>>>>>>
>>>>>> > SSD or magnetic or does it make any difference?
>>>>>>
>>>>>> SSD, GP2 (slide 64)
>>>>>>
>>>>>> > What info is available on EBS performance at peak times, when
>>>>>> multiple AWS customers have spikes of demand?
>>>>>>
>>>>>> Not published, but experiments show that we can hit 10k iops all day
>>>>>> every day with only trivial noisy neighbor problems, not enough to
>>>>>> impact a real cluster (slide 58)
>>>>>>
>>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>>
>>>>>> You can use RAID to get higher IOPS than you’d normally get by
>>>>>> default (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if
>>>>>> you need more than 10k, you can stripe volumes together up to the ebs
>>>>>> network link max) (hinted at in slide 64)
>>>>>>
>>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean,
>>>>>> with a properly configured Cassandra cluster RF provides HA, so what is
>>>>>> the equivalent for EBS? If I have RF=3, what assurance is there that
>>>>>> those three EBS volumes aren't all in the same physical rack?
>>>>>>
>>>>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>>>>> specific volumes will have issues (hypervisor’s dedicated ethernet link
>>>>>> to EBS network fails, for example). Occasionally instances will have
>>>>>> issues. The volume-specific issues seem to be less common than the
>>>>>> instance-store “instance retired” or “instance is running on degraded
>>>>>> hardware” events. Stop/Start and you’ve recovered (possible with EBS,
>>>>>> not possible with instance store). The assurances are in AWS’ SLA – if
>>>>>> the SLA is insufficient (and it probably is insufficient), use more than
>>>>>> one AZ and/or AWS region or cloud vendor.
>>>>>>
>>>>>> > For multi-data center operation, what configuration options assure
>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>
>>>>>> It used to be true that EBS control plane for a given region spanned
>>>>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ
>>>>>> are isolated (data may replicate between AZs, but a full outage in
>>>>>> us-east-1a shouldn’t affect running ebs volumes in us-east-1b or
>>>>>> us-east-1c). Slide 65
>>>>>>
>>>>>> > In terms of syncing data for the commit log, if the OS call to sync
>>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at
>>>>>> the hardware level on the EBS end, such that a power failure of the
>>>>>> systems on which the EBS volumes reside will still guarantee
>>>>>> availability of the fsynced data. As well, is return from fsync an
>>>>>> absolute guarantee of sstable durability when Cassandra is about to
>>>>>> delete the commit log, including when the two are on different volumes?
>>>>>> In practice, we would like some significant degree of pipelining of
>>>>>> data, such as during the full processing of flushing memtables, but for
>>>>>> the fsync at the end a solid guarantee is needed.
>>>>>>
>>>>>> Most of the answers in this block are “probably not 100%, you should
>>>>>> be writing to more than one host/AZ/DC/vendor to protect your
>>>>>> organization from failures”. AWS targets something like 0.1% annual
>>>>>> failure rate per volume and 99.999% availability (slide 66). We believe
>>>>>> they’re exceeding those goals (at least based on the petabytes of data
>>>>>> we have on gp2 volumes).
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Jack Krupansky
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>>
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>
>>>>>> I'm not a fan of video - this appears to be the slideshare
>>>>>> corresponding to the video:
>>>>>>
>>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>>
>>>>>> My apologies if my questions are actually answered on the video or
>>>>>> slides, I just did a quick scan of the slide text.
>>>>>>
>>>>>> I'm curious where the EBS physical devices actually reside - are they
>>>>>> in the same rack, the same data center, same availability zone? I mean,
>>>>>> people try to minimize network latency between nodes, so how exactly is
>>>>>> EBS able to avoid network latency?
>>>>>>
>>>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>>>
>>>>>> SSD or magnetic or does it make any difference?
>>>>>>
>>>>>> What info is available on EBS performance at peak times, when
>>>>>> multiple AWS customers have spikes of demand?
>>>>>>
>>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>>
>>>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with
>>>>>> a properly configured Cassandra cluster RF provides HA, so what is the
>>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>>
>>>>>> For multi-data center operation, what configuration options assure
>>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>>
>>>>>> In terms of syncing data for the commit log, if the OS call to sync
>>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at
>>>>>> the hardware level on the EBS end, such that a power failure of the
>>>>>> systems on which the EBS volumes reside will still guarantee
>>>>>> availability of the fsynced data. As well, is return from fsync an
>>>>>> absolute guarantee of sstable durability when Cassandra is about to
>>>>>> delete the commit log, including when the two are on different volumes?
>>>>>> In practice, we would like some significant degree of pipelining of
>>>>>> data, such as during the full processing of flushing memtables, but for
>>>>>> the fsync at the end a solid guarantee is needed.
>>>>>>
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <eric.plowe@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Jeff,
>>>>>>>
>>>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>>>> discounting EBS, but prior outages are worrisome.
>>>>>>>
>>>>>>>
>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.jirsa@crowdstrike.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Free to choose what you'd like, but EBS outages were also addressed
>>>>>>>> in that video (second half, discussion by Dennis Opacki). 2016 EBS
>>>>>>>> isn't the same as 2011 EBS.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jeff Jirsa
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <eric.plowe@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs
>>>>>>>> Ephemeral. GP2 after testing is a viable contender for our workload.
>>>>>>>> The only worry I have is EBS outages, which have happened.
>>>>>>>>
>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.jirsa@crowdstrike.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Also in that video - it's long but worth watching
>>>>>>>>>
>>>>>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>>>>>> ensure we weren't "just" reading from memory
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jeff Jirsa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>>>> write-intensive workloads?
>>>>>>>>>
>>>>>>>>> -- Jack Krupansky
>>>>>>>>>
>>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even
>>>>>>>>>> 50% utilization (10k is more than enough for most workloads). PIOPS
>>>>>>>>>> is not necessary.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> From: John Wong
>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>>
>>>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>>>> storage) if you are running a lot of transactions.
>>>>>>>>>> However, for a regular small testing/qa cluster, or something you
>>>>>>>>>> know you want to reload often, EBS is definitely good enough and we
>>>>>>>>>> haven't had issues 99% of the time. The 1% is kind of an anomaly
>>>>>>>>>> where we have flush blocked.
>>>>>>>>>>
>>>>>>>>>> But Jeff, kudos that you are able to use EBS. I didn't go through
>>>>>>>>>> the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>>>> production cluster?
>>>>>>>>>>
>>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <
>>>>>>>>>> bryan@blockcypher.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind
>>>>>>>>>>> of disk performance you need?". If you need the performance, it's
>>>>>>>>>>> hard to beat ephemeral SSD in RAID 0 on EC2, and it's a solid,
>>>>>>>>>>> battle tested configuration. If you don't, though, EBS GP2 will
>>>>>>>>>>> save a _lot_ of headache.
>>>>>>>>>>>
>>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found
>>>>>>>>>>> our choice of instance dictated much more by the balance of price,
>>>>>>>>>>> CPU, and memory. We're using GP2 SSD and we find that for our
>>>>>>>>>>> patterns the disk is rarely the bottleneck. YMMV, of course.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>>>> jeff.jirsa@crowdstrike.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a
>>>>>>>>>>>> node because of an instance failure, go with i2+ephemerals. Until
>>>>>>>>>>>> then, GP2 EBS is capable of amazing things, and greatly simplifies
>>>>>>>>>>>> life.
>>>>>>>>>>>>
>>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s
>>>>>>>>>>>> very much a viable option, despite any old documents online that
>>>>>>>>>>>> say otherwise.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>>
>>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We
>>>>>>>>>>>> are thinking about going with ephemeral SSDs. The question is
>>>>>>>>>>>> this: Should we put two in RAID 0 or just go with one? We currently
>>>>>>>>>>>> run a cluster in our data center with two 250 GB Samsung 850 EVOs
>>>>>>>>>>>> in RAID 0 and we are happy with the performance we are seeing thus
>>>>>>>>>>>> far.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Eric
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Steve Robenalt
>>>> Software Architect
>>>> srobenalt@highwire.org <bzavon@highwire.org>
>>>> (office/cell): 916-505-1785
>>>>
>>>> HighWire Press, Inc.
>>>> 425 Broadway St, Redwood City, CA 94063
>>>> www.highwire.org
>>>>
>>>> Technology for Scholarly Communication
>>>>
>>>
>>>
>> --
> Ben Bromhead
> CTO | Instaclustr
> +1 650 284 9692
>
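
A rough sketch of the GP2 sizing arithmetic quoted in the thread above (a 10k
IOPS per-volume cap, reached by a 3.333T volume, with striping for anything
higher). The 3 IOPS per GB ratio is what those two figures imply; the 100 IOPS
floor and the function names are assumptions for illustration only, and the
numbers reflect the 2016 discussion rather than current AWS limits.

```python
import math

# Figures from the thread: per-volume cap of 10k IOPS, reached at ~3.333T.
# Assumed here: a linear 3 IOPS per GB baseline and a 100 IOPS floor.
GP2_IOPS_PER_GB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 10_000


def gp2_baseline_iops(size_gb: int) -> int:
    """Baseline IOPS for a single gp2 volume of the given size."""
    return min(GP2_MAX_IOPS, max(GP2_MIN_IOPS, GP2_IOPS_PER_GB * size_gb))


def volumes_to_stripe(target_iops: int, size_gb: int) -> int:
    """Equally sized volumes to stripe (RAID 0) to reach target_iops.

    Ignores the instance's EBS network link limit, which caps the total
    in practice.
    """
    return math.ceil(target_iops / gp2_baseline_iops(size_gb))


if __name__ == "__main__":
    for size in (500, 1000, 3334, 4000):
        print(f"{size} GB -> {gp2_baseline_iops(size)} IOPS")
    print("volumes for 25k IOPS at 4000 GB each:", volumes_to_stripe(25_000, 4000))
```

Under these assumptions a 4T volume sits at the cap, which matches the "4T GP2
volumes, which guarantee 10k iops" configuration described in the thread; going
beyond that means striping, bounded by the instance's EBS link.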
