cassandra-user mailing list archives

From Reid Pinchback
Subject Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"
Date Wed, 04 Dec 2019 17:22:05 GMT
It probably helps to think about how swap actually functions.  It has a valid place, so long as the
behavior of the kernel and the OOM killer is understood.

You can have a lot of cold pages that have nothing at all to do with C*.  If you look at where
memory goes, it isn’t surprising to see things that the kernel finds it can page out, leaving
RAM for better things.  I’ve seen crond soak up a lot of memory, and Dell’s assorted memory-bloated
tooling, for example. For anything that is truly cold, swap is your friend, because those things
are infrequently used… swapping them in and out leaves more memory on average for what you
want.  However, that’s not a huge amount; it could be something like a half gig of RAM
kept routinely free, depending on the assorted tooling you have as a baseline install for

If swap exists to avoid the OOM killer on truly active processes, the returns there diminish
rapidly. Within seconds you’ll find you can’t even ssh into a box to investigate. In something
like a traditional database it’s worth the pain because there are multiple child processes
to the rdbms, and the OOM killer preferentially targets big process families.  Databases can
go into a panic if you toast a child, and you have a full-blown recovery on your hands.  Fortunately
the more mature databases give you knobs for memory tuning, like being able to pin particular
tables in memory if they are critical; anything not pinned (via madvise I believe) can get
tossed when under pressure.

The situation is a bit different with C*.  By design, you have replicas that the clients automatically
find, and things like speculative retry cause processing to skip over the slowpokes. The better-slow-than-dead
argument seems more tenuous to me here than for an rdbms.  And if you have an SLA based on
latency, you’ll never meet it if you have page faults happening during memory references
in the JVM. So if you have swappiness enabled, probably best to keep it tuned low.  That way
a busy C* JVM hopefully is one of the last victims in the race to shove pages to swap.
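
As a rough illustration of the "keep it tuned low" advice, here is a minimal Python sketch that reads the kernel's swappiness setting on a Linux host and compares it against a threshold. The threshold of 10 and the function names are assumptions made for this example; the thread does not name a specific value.

```python
from pathlib import Path

# Standard Linux procfs location for the vm.swappiness setting.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_swappiness(raw: str) -> int:
    """Parse the text content of the swappiness file into an integer."""
    return int(raw.strip())


def is_tuned_low(value: int, threshold: int = 10) -> bool:
    """Return True when swappiness is at or below the threshold.

    The default threshold of 10 is an illustrative assumption,
    not a value recommended anywhere in this thread.
    """
    return value <= threshold


if __name__ == "__main__":
    if SWAPPINESS_PATH.exists():
        current = parse_swappiness(SWAPPINESS_PATH.read_text())
        print(f"vm.swappiness={current}, tuned low: {is_tuned_low(current)}")
    else:
        print("No swappiness file found; this does not look like a Linux host.")
```

What counts as "low" depends on what else runs on the box, so the threshold is best treated as a starting point.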

From: Shishir Kumar
Date: Wednesday, December 4, 2019 at 8:04 AM
Subject: Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

Correct. Normally one should avoid this, as performance might degrade, but the system will not
die (until the process gets paged out).

In production we haven't done this (we just changed disk_access_mode to mmap_index_only). We have an
environment that customers use for training and beta testing, and it grows rapidly. Investing in
infrastructure does not make sense from a cost perspective, so swap is an option.

But here, if the environment is up and running, it would be interesting to understand what is consuming
memory and whether the infrastructure is sized correctly.

On Wed, 4 Dec 2019, 16:13 Hossein Ghiyasi Mehr wrote:
"3. Although DataStax does not recommend it and recommends horizontal scaling instead, depending on
your requirements an old-fashioned alternative is to add swap space."
Hi Shishir,
swap isn't recommended by DataStax!


On Tue, Dec 3, 2019 at 5:53 PM Shishir Kumar wrote:
Options, assuming the data model and configuration are good and the data size per node is less than 1 TB
(though there is no such benchmark):

1. Scale up the infrastructure for memory.
2. Try changing disk_access_mode to mmap_index_only.
In this case you should not have any in-memory DB tables.
3. Although DataStax does not recommend it and recommends horizontal scaling instead, depending on
your requirements an old-fashioned alternative is to add swap space.


On Tue, 3 Dec 2019, 15:52 John Belliveau wrote:

I've only been working with Cassandra for 2 years, and this echoes my experience as well.

Regarding the cache use, I know every use case is different, but have you experimented and
found any performance benefit to increasing its size?

John Belliveau

On Mon, Dec 2, 2019, 11:07 AM Reid Pinchback wrote:
Rahul, if my memory of this is correct, that particular logging message is noisy, the cache
is pretty much always used to its limit (and why not, it’s a cache, no point in using less
than you have).

No matter what value you set, you’ll just change the “reached (….)” part of it.  I
think what would help you more is to work with the team(s) that have apps depending upon C*
and decide what your performance SLA is with them.  If you are meeting your SLA, you don’t
care about noisy messages.  If you aren’t meeting your SLA, then the noisy messages become
sources of ideas to look at.
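
To make the point about the "reached (….)" part concrete, here is a small Python sketch that pulls the two numbers out of the message quoted in this thread's subject line. The regular expression is modeled only on that one quoted message; the function name is made up for illustration, and real log lines may carry extra prefixes (timestamps, levels) that the search simply skips over.

```python
import re

# Pattern modeled on the message text quoted in the subject of this thread.
CACHE_MESSAGE = re.compile(
    r"Maximum memory usage reached \((?P<limit>[\d.]+)MiB\), "
    r"cannot allocate chunk of (?P<chunk>[\d.]+)MiB"
)


def parse_cache_message(line: str):
    """Return (limit_mib, chunk_mib) from a matching line, or None."""
    match = CACHE_MESSAGE.search(line)
    if match is None:
        return None
    return float(match.group("limit")), float(match.group("chunk"))


if __name__ == "__main__":
    sample = (
        "Maximum memory usage reached (512.000MiB), "
        "cannot allocate chunk of 1.000MiB"
    )
    print(parse_cache_message(sample))
```

The first number is just the configured cache limit echoed back, so raising the limit changes what the message reports rather than whether it is logged.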

One thing you’ll find out pretty quickly.  There are a lot of knobs you can turn with C*,
too many to allow for easy answers on what you should do.  Figure out what your throughput
and latency SLAs are, and you’ll know when to stop tuning.  Otherwise you’ll discover
that it’s a rabbit hole you can dive into and not come out of for weeks.

From: Hossein Ghiyasi Mehr
Date: Monday, December 2, 2019 at 10:35 AM
Subject: Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

It may be helpful:
It's complex. As a simple explanation, Cassandra keeps sstables in memory based on chunk size and
sstable parts. It correctly manages loading new sstables into memory based on requests against
different sstables. You shouldn't worry about it (sstables loaded in memory).

On Mon, Dec 2, 2019 at 6:18 PM Rahul Reddy wrote:
Thanks Hossein,

How are the chunks moved out of memory (LRU?) when it wants to make room for new requests
to get chunks? If it has a mechanism to clear chunks from the cache, what causes the "cannot allocate
chunk" message? Can you point me to any documentation?

On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr wrote:
Chunks are parts of sstables. When there is enough space in memory to cache them, read performance
will increase if the application requests them again.

Your real answer is application dependent. For example, write-heavy applications are different
from read-heavy or read-write-heavy ones. Real-time applications are different from time-series
data environments, and so on.

On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy wrote:

We are seeing memory usage reach 512 MB and "cannot allocate chunk of 1 MB".  I see this because
file_cache_size_in_mb is set to 512 MB by default.

The DataStax documentation recommends increasing file_cache_size_in_mb.

We have 32 GB of memory overall, with 16 GB allocated to Cassandra. What is the recommended value in my
case? Also, when this memory fills up frequently, does nodetool flush help in avoiding
these info messages?
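
For the numbers in this question, a back-of-the-envelope Python sketch of the memory budget may help frame the sizing discussion. It assumes the default file cache is the smaller of 512 MB and one quarter of the heap, which is how I recall the comment in cassandra.yaml for 3.x describing file_cache_size_in_mb; treat that rule, and the function names, as assumptions to check against your own version.

```python
def default_file_cache_mb(heap_mb: int) -> int:
    """Assumed default: the smaller of 512 MB and a quarter of the heap."""
    return min(512, heap_mb // 4)


def remaining_after_heap_and_cache_mb(total_mb: int, heap_mb: int, file_cache_mb: int) -> int:
    """Memory left for the OS, page cache, and other off-heap use."""
    return total_mb - heap_mb - file_cache_mb


if __name__ == "__main__":
    total_mb = 32 * 1024   # 32 GB host, from the question
    heap_mb = 16 * 1024    # 16 GB given to Cassandra, from the question
    cache_mb = default_file_cache_mb(heap_mb)
    print(cache_mb)                                                       # 512
    print(remaining_after_heap_and_cache_mb(total_mb, heap_mb, cache_mb)) # 15872
```

The remainder is shared by everything else on the host, so it is an upper bound on how far the file cache could be raised, not a recommendation.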