cassandra-user mailing list archives

From: Will Hayworth <>
Subject: Re: Back to the futex()? :(
Date: Tue, 09 Feb 2016 19:33:26 GMT
Thanks for the links, Ben--I'll be sure to give those a read. And yeah, I
followed the CrowdStrike presentation in detail (including the stuff they
called out on Al Tobey's page). Again, the reason for the huge heap is that
otherwise my memtables can't actually fit in memory (no off-heap memtables
until 3.4),
but your point about tuning before true prod is well taken. :) (And I read
your post, too--the reason we started with m4.xlarges is in large part
because you all made it work.)

Nate--the RF is taken care of, thanks (otherwise I've seen issues where my
code can't log in to a given node, which makes sense) and, furthermore, I
ran a repair after doing all the initial loading. I'm not doing dynamic
permissions (though I'm hoping to use Vault <> to
generate short-lived user/password combinations soon), so I'll be sure to
adjust permissions_validity_in_ms.
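For reference, the CQL behind "the RF is taken care of" is roughly the
sketch below--SimpleStrategy is just an assumption here (use
NetworkTopologyStrategy and your DC name if that's what your snitch gives
you), and 12 simply matches our node count, per Nate's advice:

    -- make every node a replica for the auth data
    ALTER KEYSPACE system_auth
      WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 12};

    -- then repair it so existing credentials get redistributed:
    --   nodetool repair system_auth

(permissions_validity_in_ms itself lives in cassandra.yaml, not in CQL.)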

Thank you both so much for your help!

Will Hayworth
Developer, Engagement Engine

My pronoun is "they". <>

On Tue, Feb 9, 2016 at 11:25 AM, Nate McCall <> wrote:

> I noticed you have authentication enabled. Make sure you set the following:
> - the replication factor for the system_auth keyspace should equal the
> number of nodes
> - permissions_validity_in_ms is a permission cache timeout. If you are not
> doing dynamic permissions or creating/revoking frequently, turn this WAY up.
> May not be the immediate reason, but the above are definitely not helping
> if set at defaults.
> On Sat, Feb 6, 2016 at 6:49 PM, Will Hayworth <>
> wrote:
>> Additionally: this isn't the futex_wait bug (or at least it shouldn't
>> be?). Amazon says
>> <> that it was
>> fixed several kernel versions before mine, which
>> is 4.1.10-17.31.amzn1.x86_64. And the reason my heap is so large is
>> because, per CASSANDRA-9472, we can't use off-heap memtables until 3.4 is
>> released.
>> Will
>> ___________________________________________________________
>> Will Hayworth
>> Developer, Engagement Engine
>> Atlassian
>> My pronoun is "they". <>
>> On Sat, Feb 6, 2016 at 3:28 PM, Will Hayworth <>
>> wrote:
>>> *tl;dr: other than CAS operations, what are the potential sources of
>>> lock contention in C*?*
>>> Hi all! :) I'm a novice Cassandra and Linux admin who's been preparing a
>>> small cluster for production, and I've been seeing something weird. For
>>> background: I'm running 3.2.1 on a cluster of 12 EC2 m4.2xlarges (32 GB
>>> RAM, 8 HT cores) backed by 3.5 TB GP2 EBS volumes. Until late yesterday,
>>> that was a cluster of 12 m4.xlarges with 3 TB volumes. I bumped it because
>>> while backloading historical data I had been seeing awful throughput (20K
>>> op/s at CL.ONE). I'd read through Al Tobey's *amazing* C* tuning guide
>>> <> once
>>> or twice before but this time I was careful and fixed a bunch of defaults
>>> that just weren't right, in cassandra.yaml/JVM options/block device
>>> parameters. Folks on IRC were super helpful as always (hat tip to Jeff
>>> Jirsa in particular) and pointed out, for example, that I shouldn't be
>>> using DTCS for loading historical data--heh. After changing to LTCS,
>>> unbatching my writes,* reserving a CPU core for interrupts, and fixing
>>> the clocksource to TSC, I finally hit 80K op/s early this morning. Hooray! :)
>>> Now, my question: I'm still seeing a *ton* of blocked processes in the
>>> vmstat output, anything from 2 to 9 per 10-second sample period--and this is
>>> before EBS is even being hit! I've been trying in vain to figure out what
>>> this could be--GC seems very quiet, after all. On Al's page's advice, I've
>>> been running strace and, indeed, I've been seeing *tens of thousands of
>>> futex() calls* in periods of 10 or 20 seconds. What eludes me is *where* this
>>> lock contention is coming from. I'm not using LWTs or performing any CAS
>>> operations that I'm aware of. Assuming this isn't a red herring, what
>>> gives?
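(For clarity, the kind of CAS/LWT write I mean is the conditional-write
shape below; the table is a made-up example, not our schema, and we don't
issue anything like this.)

    -- lightweight transactions / compare-and-set in CQL
    INSERT INTO user_settings (user_id, theme) VALUES (?, ?) IF NOT EXISTS;
    UPDATE user_settings SET theme = ? WHERE user_id = ? IF theme = ?;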
>>> Sorry for the essay--I just wanted to err on the side of more
>>> context--and *thank you* for any advice you'd like to offer,
>>> Will
>>> P.S. More background if you'd like--I'm running on Amazon Linux 2015.09,
>>> using jemalloc 3.6, JDK 1.8.0_65-b17. Here
>>> <> is my cassandra.yaml and here
>>> <> are my JVM args. I realized I neglected
>>> to adjust memtable_flush_writers as I was writing this--so I'll get on
>>> that. Aside from that, I'm not sure what to do. (Thanks, again, for
>>> reading.)
>>> * They were batched for consistency--I'm hoping to return to using them
>>> when I'm back at normal load, which is tiny compared to backloading, but
>>> the impact on performance was eye-opening.
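(To make that footnote concrete: "batched" vs. "unbatched" here means
roughly the difference below. The table and columns are hypothetical, just
to show the shape of the writes.)

    -- before: one logged batch spanning many partitions
    BEGIN BATCH
      INSERT INTO events (user_id, day, ts, payload) VALUES (?, ?, ?, ?);
      INSERT INTO events (user_id, day, ts, payload) VALUES (?, ?, ?, ?);
    APPLY BATCH;

    -- after: each row as its own (ideally asynchronous) INSERT
    INSERT INTO events (user_id, day, ts, payload) VALUES (?, ?, ?, ?);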
>>> ___________________________________________________________
>>> Will Hayworth
>>> Developer, Engagement Engine
>>> Atlassian
>>> My pronoun is "they". <>
> --
> -----------------
> Nate McCall
> Austin, TX
> @zznate
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
