kafka-users mailing list archives

From Theo Hultberg <t...@iconara.net>
Subject Re: HDD or SSD or EBS for kafka brokers in Amazon EC2
Date Wed, 03 Jun 2015 04:46:50 GMT
Henry: We run Kafka on the old and trusty m1.xlarge. We avoid EBS
completely: it's network storage that pretends to be local, and when the
network, which is AWS' weak spot, acts up, EBS is a big liability. It's also
slow and expensive.

Others: Thanks for sharing your experience with the d2's. We have been
considering them for Kafka, but now it sounds like we should hold off
until they're fixed.

T#

On Wed, Jun 3, 2015 at 1:26 AM, Henry Cai <hcai@pinterest.com.invalid>
wrote:

> Steven,
>
> Do you have the AWS case # (or the Ubuntu bug/case #) when you hit that
> kernel panic issue?
>
> Our company will still be running on the Ubuntu 12.04 AMI for a while. I will
> see whether the fix was also backported to Ubuntu 12.04.
>
> On Tue, Jun 2, 2015 at 2:53 PM, Steven Wu <stevenz3wu@gmail.com> wrote:
>
> > Now I remember we had the same kernel panic issue in the first week of the
> > D2 rollout. Then AWS fixed it and we haven't seen any issue since. Try
> > Ubuntu 14.04 and see if it resolves your remaining kernel/instability
> > issue.
> >
> > On Tue, Jun 2, 2015 at 2:30 PM, Wes Chow <wes@chartbeat.com> wrote:
> >
> >>
> >>   Daniel Nelson <daniel.nelson@vungle.com>
> >>  June 2, 2015 at 4:39 PM
> >>
> >> On Jun 2, 2015, at 1:22 PM, Steven Wu <stevenz3wu@gmail.com> wrote:
> >>
> >> can you elaborate what kind of instability you have encountered?
> >>
> >> We have seen the nodes become completely non-responsive. Usually they
> >> get rebooted automatically after 10-20 minutes, but occasionally they get
> >> stuck for days in a state where they cannot be rebooted via the Amazon APIs.
> >>
> >>
> >> Same here. It was worse right after d2 launch. We had 6 out of 9 servers
> >> die within 10 hours after spinning them up. Amazon rolled out a fix, but
> >> we're still seeing similar issues, though not nearly as bad. The first fix
> >> was for something network related, and apparently sending lots of data
> >> through the instances caused a kernel panic on the host. We have no
> >> information yet about the current issue.
> >>
> >> Wes
> >>
> >>   Steven Wu <stevenz3wu@gmail.com>
> >>  June 2, 2015 at 4:22 PM
> >> Wes/Daniel,
> >>
> >> can you elaborate what kind of instability you have encountered?
> >>
> >> We are on Ubuntu 14.04.2 and haven't encountered any issues so far. In
> >> the announcement, they did mention using Ubuntu 14.04 for better disk
> >> throughput. I'm not sure whether 14.04 also addresses the instability
> >> issues you encountered.
> >>
> >> Thanks,
> >> Steven
> >>
> >> In order to ensure the best disk throughput performance from your D2
> >> instances on Linux, we recommend that you use the most recent version of
> >> the Amazon Linux AMI, or another Linux AMI with a kernel version of 3.8 or
> >> later. The D2 instances provide the best disk performance when you use a
> >> Linux kernel that supports Persistent Grants – an extension to the Xen
> >> block ring protocol that significantly improves disk throughput and
> >> scalability. The following Linux AMIs support this feature:
> >>
> >>    - Amazon Linux AMI 2015.03 (HVM)
> >>    - Ubuntu Server 14.04 LTS (HVM)
> >>    - Red Hat Enterprise Linux 7.1 (HVM)
> >>    - SUSE Linux Enterprise Server 12 (HVM)
> >>
> >>
> >>
> >>
> >>   Daniel Nelson <daniel.nelson@vungle.com>
> >>  June 2, 2015 at 2:42 PM
> >>
> >> Do you have any workarounds for the d2 issues? We’ve been using them for
> >> our Kafkas too, and ran into the instability. We’re on Ubuntu 12.04 and
> >> plan to try on 14.04 with the latest HWE to see if that helps any.
> >>
> >> Thanks!
> >>   Wes Chow <wes@chartbeat.com>
> >>  June 2, 2015 at 1:39 PM
> >>
> >> We have run d2 instances with Kafka. They're currently unstable: yesterday
> >> Amazon confirmed a host issue with d2 instances that gets tickled by a
> >> Kafka workload. Otherwise, the d2 instance type seems ideal, as it
> >> gets an enormous amount of disk throughput and you'll likely be network
> >> bottlenecked.
> >>
> >> Wes
> >>
> >>
> >>   Steven Wu <stevenz3wu@gmail.com>
> >>  June 2, 2015 at 1:07 PM
> >> EBS (network attached storage) has got a lot better over the last few
> >> years, but we still don't quite trust it for Kafka workloads.
> >>
> >> At Netflix, we were going with the new d2 instance type (HDD). Our
> >> perf/load testing shows it satisfies our workload. SSD has a better latency
> >> curve but is pretty comparable in terms of throughput. We can use the extra
> >> space from HDD for a longer retention period.
> >>
> >> On Tue, Jun 2, 2015 at 9:37 AM, Henry Cai <hcai@pinterest.com.invalid>
> >>
> >>
> >
>
