On Fri, Dec 28, 2018, 4:23 PM Oleksandr Shulgin <oleksandr.shulgin@zalando.de> wrote:
On Fri, Dec 7, 2018 at 12:43 PM Oleksandr Shulgin <oleksandr.shulgin@zalando.de> wrote:

After a fresh JVM start the memory allocation looks roughly like this:

             total       used       free     shared    buffers     cached
Mem:           14G        14G       173M       1.1M        12M       3.2G
-/+ buffers/cache:        11G       3.4G
Swap:           0B         0B         0B

Then, over a number of days, the disk cache shrinks all the way down to unreasonably low numbers such as 150M.  At the same time "free" stays at its original level and "used" grows all the way up to 14G.  Shortly after that the node becomes unavailable because of the IO load, and eventually the JVM gets killed.

Most importantly, the resident size of the JVM process stays at around 11-12G the whole time, just as it was shortly after the start.  How can we find where the rest of the memory gets allocated?  Is it just some sort of malloc fragmentation?

For the ones following along at home, here's what we ended up with so far:

0. Switched to the next biggest EC2 instance type, r4.xlarge, and the symptoms are gone.  Our bill is dominated by the price of EBS storage, so this is much less than a 2x increase in total.

1. We've noticed that increased memory usage correlates with the number of SSTables on disk.  When the number of files on disk decreases, available memory increases.  This leads us to think that the extra memory allocation is indeed due to the use of mmap.  It's not clear how we could account for that.

2. Improved our monitoring to include the number of files (via total - free inodes).
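
For anyone who wants to reproduce the file-count metric from item 2, here is a minimal sketch in Python.  It is not our actual monitoring code: the function name used_inodes and the default path "/" are made up for illustration, and you would point it at whatever mount holds the data directory.  It relies only on os.statvfs, which reports total and free inodes for a filesystem.

```python
import os
import sys


def used_inodes(path="/"):
    """Return the number of inodes in use on the filesystem containing `path`.

    This is the "total - free inodes" figure, which approximates the number
    of files and directories on that filesystem.
    """
    st = os.statvfs(path)
    # f_files is the total number of inodes, f_ffree the number still free.
    return st.f_files - st.f_ffree


if __name__ == "__main__":
    # The path is an assumption: pass the mount point of the data volume.
    path = sys.argv[1] if len(sys.argv) > 1 else "/"
    print(f"{path}: {used_inodes(path)} inodes in use")
```

Note that this counts everything on the filesystem, not just SSTables, and some filesystems do not report meaningful inode totals, so treat the number as a trend indicator rather than an exact file count.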

Given the cluster's resource utilization, it still feels like r4.large would be a good fit, if only we could figure out those few "missing" GB of RAM. ;-)
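
One place to look for the "missing" memory, offered purely as a guess and not something we have confirmed: since the JVM's resident size stays flat, the growth might be on the kernel side (slab caches, page tables) rather than in the process itself.  The sketch below, again in Python, pulls the relevant counters out of /proc/meminfo so they can be graphed next to the file count.  The helper names parse_meminfo and kernel_side_kb are made up for this example; the field names are standard /proc/meminfo fields.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


# Counters that describe memory held by the kernel rather than by processes.
KERNEL_FIELDS = ("Slab", "SReclaimable", "SUnreclaim", "PageTables", "KernelStack")


def kernel_side_kb(fields):
    """Pick out the kernel-side counters that are present, in kB."""
    return {k: fields[k] for k in KERNEL_FIELDS if k in fields}


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    for name, kb in kernel_side_kb(info).items():
        print(f"{name:>14}: {kb / 1024:8.1f} MiB")
```

If these counters stay small while "used" keeps growing, this guess is wrong and the memory is going somewhere else.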