cassandra-user mailing list archives

From Reid Pinchback <>
Subject Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"
Date Tue, 03 Dec 2019 16:43:26 GMT
John, anything I’ll say will be as a collective ‘we’ since it has been a team effort
here at Trip, and I’ve just been the hired gun to help out a bit. I’m more of a Postgres
and Java guy so filter my answers accordingly.

I can’t say we saw as much relevance in tuning the chunk cache size as we did in doing everything
possible to migrate things off-heap.  I haven’t worked with 2.x so I don’t know how much
these options changed, but in 3.11.x at least, you can definitely migrate a fair bit off-heap.
 Our first use case was sensitive to latency at the 3-nines level, which turns out to be a rough
go for C*, particularly if the data model is a bit askew from C*’s sweet spot, as was true for us.
 The deeper merkle trees introduced somewhere in the 3.0.x series were a bane of our existence;
we back-patched the 4.0 work to tune the tree height so that we weren’t OOMing nodes during
Reaper repair runs.
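For reference, the off-heap knobs in 3.11.x live in cassandra.yaml and look roughly like this (a sketch with illustrative values, not our production settings; tune against your own latency SLA):

```yaml
# cassandra.yaml -- sketch of 3.11.x off-heap options (illustrative values)
memtable_allocation_type: offheap_objects   # move memtable cell data off-heap
memtable_offheap_space_in_mb: 2048          # cap for off-heap memtable space
buffer_pool_use_heap_if_exhausted: false    # don't fall back to on-heap buffers
```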

As to Shishir’s notion of using swap, because latency mattered to us, we had RAM headroom
on the boxes.  We couldn’t use it all without pushing on something that was hurting us on
3 9’s.  C* is like this over-constrained problem space when it comes to tuning: poking in
one place results in a twitch somewhere else, and we had to see which twitches worked out
in our favour. If, like us, you have RAM headroom, you’re unlikely to care about swap for
obvious reasons.  All you really need is enough room for the O/S file buffer cache.

Tuning related to I/O and the file buffer cache mattered a fair bit, as did GC tuning, obviously.
 Personally, if I were to look at swap as helpful, I’d be debating with myself whether the sstables
should just remain uncompressed in the first place.  After all, swap space is disk space, so
holding compressed+uncompressed at the same time would only make sense if the storage footprint
was large but the hot data in use was routinely much smaller… yet stuck around long enough
in a cold state that the kernel would target it to swap out.  That’s a lot of if’s to
line up to your benefit.  When it comes to a system running based on garbage collection, I
get skeptical of how effectively the O/S will determine what is good to swap. Most of the
JVM memory in C* churns at a rate that you wouldn’t want swap i/o to combine with if you
cared about latency.  Not everybody cares about tight variance on latency though, so there
can be other rationales for tuning that would result in different conclusions from ours.

I might have more definitive statements to make in the upcoming months; I’m in the midst
of putting together my own test cluster for more controlled analysis of C* and Kafka tuning.
I’ve found that tuning live environments makes it hard to control the variables enough for my
satisfaction. It can feel like a game of empirical whack-a-mole.

From: Shishir Kumar <>
Reply-To: "" <>
Date: Tuesday, December 3, 2019 at 9:23 AM
To: "" <>
Subject: Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

Message from External Sender
Options, assuming the model and configuration are good and data size per node is less than 1 TB
(though there is no such benchmark):

1. Scale infrastructure for memory.
2. Try changing disk_access_mode to mmap_index_only.
In this case you should not have any in-memory DB tables.
3. Though DataStax does not recommend it and recommends horizontal scaling instead, depending on
your requirement an alternate old-fashioned option is to add swap space.
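For anyone trying option 2, the setting lives in cassandra.yaml. Note that disk_access_mode is an undocumented setting, so treat this as a sketch:

```yaml
# cassandra.yaml -- sketch for option 2 above.
# disk_access_mode is undocumented; known values include auto, mmap,
# mmap_index_only, and standard. mmap_index_only mmaps only the index files,
# reading data files through standard I/O.
disk_access_mode: mmap_index_only
```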


On Tue, 3 Dec 2019, 15:52 John Belliveau, <<>>

I've only been working with Cassandra for 2 years, and this echoes my experience as well.

Regarding the cache use, I know every use case is different, but have you experimented and
found any performance benefit to increasing its size?

John Belliveau

On Mon, Dec 2, 2019, 11:07 AM Reid Pinchback <<>>
Rahul, if my memory of this is correct, that particular logging message is noisy; the cache
is pretty much always used to its limit (and why not, it’s a cache, no point in using less
than you have).

No matter what value you set, you’ll just change the “reached (….)” part of it.  I
think what would help you more is to work with the team(s) that have apps depending upon C*
and decide what your performance SLA is with them.  If you are meeting your SLA, you don’t
care about noisy messages.  If you aren’t meeting your SLA, then the noisy messages become
sources of ideas to look at.

One thing you’ll find out pretty quickly.  There are a lot of knobs you can turn with C*,
too many to allow for easy answers on what you should do.  Figure out what your throughput
and latency SLAs are, and you’ll know when to stop tuning.  Otherwise you’ll discover
that it’s a rabbit hole you can dive into and not come out of for weeks.

From: Hossein Ghiyasi Mehr <<>>
Reply-To: "<>" <<>>
Date: Monday, December 2, 2019 at 10:35 AM
To: "<>" <<>>
Subject: Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

It may be helpful:<>
It's complex. The simple explanation: Cassandra keeps sstables in memory based on chunk size and
sstable parts. It manages loading new sstables into memory based on requests against different
sstables. You shouldn't need to worry about it (sstables loaded in memory).
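A toy sketch of the kind of bounded, least-recently-used chunk caching being described (illustrative only; this is not Cassandra's actual BufferPool/chunk cache code):

```python
from collections import OrderedDict

class ChunkCache:
    """Toy LRU cache of fixed-size chunks (illustrative, not Cassandra's code)."""

    def __init__(self, capacity_mb, chunk_mb=1):
        self.capacity = capacity_mb
        self.chunk_mb = chunk_mb
        self.cache = OrderedDict()  # key -> chunk bytes, insertion-ordered

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        return None  # cache miss: caller must read the chunk from disk

    def put(self, key, chunk):
        # Evict least-recently-used chunks until the new one fits.
        while len(self.cache) * self.chunk_mb >= self.capacity:
            self.cache.popitem(last=False)
        self.cache[key] = chunk
```

With a 512-entry capacity the cache sits at its limit during normal operation, which is why a "maximum memory usage reached" style message is expected noise rather than a leak.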

On Mon, Dec 2, 2019 at 6:18 PM Rahul Reddy <<>>
Thanks Hossein,

How do the chunks get moved out of memory (LRU?) when it wants to make room for new requests
for chunks? If it has a mechanism to clear chunks from the cache, what causes "cannot allocate
chunk"? Can you point me to any documentation?

On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr <<>>
Chunks are parts of sstables. When there is enough space in memory to cache them, read performance
will increase if the application requests them again.

Your real answer is application dependent. For example, write-heavy applications are different
from read-heavy or read-write-heavy ones, and real-time applications are different from time-series
data environments, and so on.

On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy <<>>

We are seeing "memory usage reached 512 MiB, cannot allocate 1 MiB".  I see this because file_cache_size_in_mb
is set to 512 MB by default.

The DataStax documentation recommends increasing file_cache_size_in_mb.

We have 32 GB of overall memory, with 16 GB allocated to Cassandra. What is the recommended value in my
case? Also, when this memory fills up frequently, does nodetool flush help in avoiding
these info messages?
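If you do decide to raise it, the knob lives in cassandra.yaml (a sketch; the value shown is illustrative, not a recommendation):

```yaml
# cassandra.yaml -- chunk cache size; default is the smaller of 512 MiB
# or 1/4 of the heap. The value below is illustrative only.
file_cache_size_in_mb: 1024
```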