cassandra-user mailing list archives

From Avi Kivity <...@scylladb.com>
Subject Re: CS process killed by kernel OOM
Date Mon, 06 Feb 2017 10:53:37 GMT

On 01/26/2017 07:36 AM, Benjamin Roth wrote:
> Hi there,
>
> We installed 2 new nodes these days. They run on ubuntu (Ubuntu 
> 16.04.1 LTS) with kernel 4.4.0-59-generic. On these nodes (and only on 
> these) CS gets killed by the kernel due to OOM. It seems very strange 
> to me because CS only takes roughly 20GB (out of 128GB), and most of RAM 
> is allocated to page cache.
>
> Top looks typically like this:
> KiB Mem : 13191691+total,  1974964 free, 20278184 used, 10966376+buff/cache
> KiB Swap:        0 total,        0 free,        0 used. 11051503+avail Mem
>
> This is what kern.log says:
> https://gist.github.com/brstgt/0f1aa6afb558a56d1cadce958db46cf9
>
> Has anyone encountered something like this before?
>

2017-01-26T03:10:45.679458+00:00 cas10 kernel: [52226.449989] Node 0 Normal: 33850*4kB (UMEH) 8*8kB (UMH) 1*16kB (H) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 135480kB
2017-01-26T03:10:45.679460+00:00 cas10 kernel: [52226.449995] Node 1 Normal: 34213*4kB (UME) 176*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 138260kB


There is plenty of free memory left, (33850+34213)*4kB = roughly 270 MB in 
4 kB blocks alone, but it is fragmented into 4 kB and 8 kB blocks, while 
the kernel is trying to allocate 16 kB.  Still, the kernel could have 
evicted some page cache or swapped out anonymous memory.  You should 
report this to lkml; it is a kernel bug.
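
The arithmetic above can be reproduced with a short script. This is only a minimal sketch in Python: the helper names `parse_buddy_line` and `free_kb` and the regular expression are illustrative assumptions, not part of any kernel tooling. It reads the per-size free-list summaries copied from the two log lines above and totals the free memory by block size.

```python
import re

# Matches entries such as "33850*4kB" in a kernel free-list summary line.
# The trailing "= 135480kB" total has no "*", so it is not matched.
ENTRY = re.compile(r"(\d+)\*(\d+)kB")

def parse_buddy_line(line):
    """Return {block_size_kB: free_block_count} for one zone summary line."""
    return {int(size): int(count) for count, size in ENTRY.findall(line)}

def free_kb(blocks, min_size_kb=0):
    """Total free kB held in blocks of at least min_size_kb."""
    return sum(size * count for size, count in blocks.items()
               if size >= min_size_kb)

# Free-list summaries copied from the log lines quoted above.
node0 = parse_buddy_line(
    "Node 0 Normal: 33850*4kB (UMEH) 8*8kB (UMH) 1*16kB (H) 0*32kB 0*64kB "
    "0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 135480kB")
node1 = parse_buddy_line(
    "Node 1 Normal: 34213*4kB (UME) 176*8kB (UME) 0*16kB 0*32kB 0*64kB "
    "0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 138260kB")

for name, blocks in (("Node 0", node0), ("Node 1", node1)):
    print(name, "total free kB:", free_kb(blocks),
          "| free kB in blocks of 16 kB or larger:", free_kb(blocks, 16))

# Free memory held in 4 kB blocks across both nodes, in kB.
print("4 kB blocks only:", 4 * (node0[4] + node1[4]), "kB")
```

The per-node totals computed this way agree with the `= 135480kB` and `= 138260kB` figures the kernel printed, and nearly all of that memory sits in 4 kB blocks.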



> -- 
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com <http://www.jaumo.com>
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer

