cassandra-user mailing list archives

From Ben Bromhead <>
Subject Re: Back to the futex()? :(
Date Tue, 09 Feb 2016 19:11:43 GMT
I'm not surprised that profiling Cassandra turns up some lock contention,
particularly given its SEDA architecture: threads spend a lot of time
waiting while requests make their way through the various stages.
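A quick way to see which stages are queuing is `nodetool tpstats` on a live node. As a sketch, here is how you might filter its output down to stages that have ever blocked; the sample output is inlined (the pool names are real stage names, the counts are made up) so the filtering step runs standalone:

```shell
# Stand-in for `nodetool tpstats` on a live node -- sample output only,
# the numbers here are hypothetical.
nodetool_tpstats() {
cat <<'EOF'
Pool Name               Active   Pending    Completed   Blocked  All time blocked
MutationStage                8       142     98765432         0              1203
ReadStage                    2         0      1234567         0                 0
CompactionExecutor           1         3        45678         0                 0
EOF
}

# Keep the header plus any stage whose "All time blocked" count is non-zero.
nodetool_tpstats | awk 'NR==1 || $NF > 0'
```

Stages that show up here (pending backlogs or blocked counts) tell you where requests are actually waiting, which is usually more informative than raw futex counts.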


So I would say the thread_wait issue is a red herring in this case, given it
will be inherent in most Cassandra deployments. The caveat is that you are
running 3.2.1, a very new version of Cassandra that may have new bugs, and
I'm not sure how many people here have experience with it, especially since
the new tick-tock approach makes it hard to judge when a release is ready
for prime time.
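If you want to confirm the futexes are just thread handoffs rather than one hot lock, you can attribute them to individual threads. A rough sketch: trace only futex calls per thread with strace, then count calls per TID. The strace output below is inlined (TIDs and addresses are invented) so the counting step runs on its own:

```shell
# On a live node you would capture per-thread futex calls with something like:
#   strace -f -e trace=futex -p <cassandra_pid> -o trace.out
# Sample output is inlined here so the analysis step is self-contained;
# the TIDs and lock addresses are made up.
cat > trace.out <<'EOF'
12345 futex(0x7f1c2c0009d0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
12345 futex(0x7f1c2c0009d0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
12346 futex(0x7f1c2c000a10, FUTEX_WAKE_PRIVATE, 1) = 1
12345 futex(0x7f1c2c0009d0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
EOF

# Count futex calls per thread ID, busiest first.
awk '{count[$1]++} END {for (tid in count) print tid, count[tid]}' trace.out | sort -k2 -rn
```

You can then map a noisy TID back to a Java thread name: convert it to hex (`printf '%x\n' 12345`) and look for that value in the `nid=0x...` field of `jstack` output. If the busy threads are all stage workers, the futexes are just the stage machinery at work.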

Otherwise, follow the good folk at CrowdStrike for getting good performance
out of EBS (<>). They have done all the hard work for the rest of us.

Reduce your JVM heap size to something closer to 8 GB. Given that your
cluster hasn't seen a production workload, I wouldn't worry about tuning
the heap etc. unless you see GC pressure in the logs. You don't want to
spend a lot of time tuning for backloading when the actual traffic could be
different.
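For reference, pinning the heap at 8 GB is just a couple of lines; on 3.x this goes in jvm.options (older versions use MAX_HEAP_SIZE in cassandra-env.sh). Treat these values as a starting point, not a tuned recommendation:

```
# jvm.options (Cassandra 3.x) -- fix min and max heap to the same value
# so the JVM never resizes the heap at runtime. 8G is a starting point.
-Xms8G
-Xmx8G
```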

The performance you are getting is roughly on par with what we have seen in
some early benchmarking of EBS volumes (<>), but with machines half the
size. We decided to go down a slightly different path and use m4.xlarges;
we are always playing with different configurations to see what works best.

On Sat, 6 Feb 2016 at 16:50 Will Hayworth <> wrote:

> Additionally: this isn't the futex_wait bug (or at least it shouldn't
> be?). Amazon says
> <> that was
> fixed several kernel versions before mine, which
> is 4.1.10-17.31.amzn1.x86_64. And the reason my heap is so large is
> that, per CASSANDRA-9472, we can't use offheap until 3.4 is released.
> Will
> ___________________________________________________________
> Will Hayworth
> Developer, Engagement Engine
> Atlassian
> My pronoun is "they". <>
> On Sat, Feb 6, 2016 at 3:28 PM, Will Hayworth <>
> wrote:
>> *tl;dr: other than CAS operations, what are the potential sources of lock
>> contention in C*?*
>> Hi all! :) I'm a novice Cassandra and Linux admin who's been preparing a
>> small cluster for production, and I've been seeing something weird. For
>> background: I'm running 3.2.1 on a cluster of 12 EC2 m4.2xlarges (32 GB
>> RAM, 8 HT cores) backed by 3.5 TB GP2 EBS volumes. Until late yesterday,
>> that was a cluster of 12 m4.xlarges with 3 TB volumes. I bumped it because
>> while backloading historical data I had been seeing awful throughput (20K
>> op/s at CL.ONE). I'd read through Al Tobey's *amazing* C* tuning guide
>> <> once
>> or twice before but this time I was careful and fixed a bunch of defaults
>> that just weren't right, in cassandra.yaml/JVM options/block device
>> parameters. Folks on IRC were super helpful as always (hat tip to Jeff
>> Jirsa in particular) and pointed out, for example, that I shouldn't be
>> using DTCS for loading historical data--heh. After changing to LTCS,
>> unbatching my writes,* reserving a CPU core for interrupts, and fixing
>> the clocksource to TSC, I finally hit 80K early this morning. Hooray! :)
>> Now, my question: I'm still seeing a *ton* of blocked processes in
>> vmstat, anything from 2 to 9 per 10-second sample period--and this is
>> before EBS is even being hit! I've been trying in vain to figure out what
>> this could be--GC seems very quiet, after all. On Al's page's advice, I've
>> been running strace and, indeed, I've been seeing *tens of thousands of
>> futex() calls* in periods of 10 or 20 seconds. What eludes me is *where* this
>> lock contention is coming from. I'm not using LWTs or performing any CAS
>> operations that I'm aware of. Assuming this isn't a red herring, what
>> gives?
>> Sorry for the essay--I just wanted to err on the side of more
>> context--and *thank you* for any advice you'd like to offer,
>> Will
>> P.S. More background if you'd like--I'm running on Amazon Linux 2015.09,
>> using jemalloc 3.6, JDK 1.8.0_65-b17. Here <> is
>> my cassandra.yaml and here <> are my JVM
>> args. I realized I neglected to adjust memtable_flush_writers as I was
>> writing this--so I'll get on that. Aside from that, I'm not sure what to
>> do. (Thanks, again, for reading.)
>> * They were batched for consistency--I'm hoping to return to using them
>> when I'm back at normal load, which is tiny compared to backloading, but
>> the impact on performance was eye-opening.
>> ___________________________________________________________
>> Will Hayworth
>> Developer, Engagement Engine
>> Atlassian
>> My pronoun is "they". <>
> --
Ben Bromhead
CTO | Instaclustr <>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer
