mesos-user mailing list archives

From Stephan Erb
Subject Re: Problems with OOM
Date Mon, 06 Oct 2014 16:56:29 GMT

I am still facing the same issue:

  * My process keeps allocating memory until all available system memory
    is used, but it is never killed. Its sandbox is limited to x00 MB
    but it ends up using several GB.
  * There is no OOM or cgroup related entry in dmesg (besides the
    initialization, i.e., "Initializing cgroup subsys memory"...)
  * The slave log contains nothing suspicious (see the attached logfile)
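
One way to narrow this down is to read the memory cgroup files for the
container directly and check whether the limit is set and whether the
process is actually inside the cgroup. The following is a minimal sketch,
not something taken from Mesos itself. It assumes a cgroup v1 hierarchy;
the directory passed in (for example something under
/sys/fs/cgroup/memory/mesos/) is an assumption that depends on the
--cgroups_hierarchy and --cgroups_root settings, and the function names
are hypothetical.

```python
import os


def parse_oom_control(text):
    """Parse the 'key value' lines of a cgroup v1 memory.oom_control file."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            result[parts[0]] = int(parts[1])
    return result


def read_memory_cgroup(cgroup_dir):
    """Collect limit, usage, OOM state and member PIDs of one memory cgroup.

    cgroup_dir is assumed to be the container's directory in the cgroup v1
    memory hierarchy; the exact location depends on the local setup.
    """
    def read(name):
        with open(os.path.join(cgroup_dir, name)) as f:
            return f.read()

    limit = int(read("memory.limit_in_bytes"))
    usage = int(read("memory.usage_in_bytes"))
    oom = parse_oom_control(read("memory.oom_control"))
    pids = [int(p) for p in read("cgroup.procs").split()]
    return {
        "limit": limit,
        "usage": usage,
        # 1 means the kernel OOM killer is disabled for this cgroup.
        "oom_kill_disable": oom.get("oom_kill_disable"),
        "under_oom": oom.get("under_oom"),
        "pids": pids,
    }


def pid_in_cgroup(info, pid):
    """Return True if the given PID is a member of the inspected cgroup."""
    return pid in info["pids"]
```

If the leaking process's PID is missing from cgroup.procs, or
oom_kill_disable is 1, that would explain why the limit never leads to a
kill.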

Updating my Debian kernel from 3.2 to a backported 3.16 kernel did not
help. The system is more responsive under load, but the OOM killer is
still not triggered. I have not yet tried running kernelshark on either
kernel.
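
Since the kernel changed, it may also be worth confirming that the memory
controller is enabled at all on the running kernel. A minimal sketch that
parses the contents of /proc/cgroups (columns: subsys_name, hierarchy,
num_cgroups, enabled); the function name is hypothetical and the sample
text in the comment is made up for illustration:

```python
def parse_proc_cgroups(text):
    """Parse /proc/cgroups content into {subsystem: {...}} entries."""
    subsystems = {}
    for line in text.splitlines():
        # Skip the '#subsys_name hierarchy num_cgroups enabled' header.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 4:
            continue
        name, hierarchy, num_cgroups, enabled = fields
        subsystems[name] = {
            "hierarchy": int(hierarchy),
            "num_cgroups": int(num_cgroups),
            "enabled": enabled == "1",
        }
    return subsystems


# Example usage on a live system:
#   with open("/proc/cgroups") as f:
#       print(parse_proc_cgroups(f.read()).get("memory"))
```

An "enabled" value of False for the memory subsystem would mean no memory
limit can be enforced, regardless of the slave flags.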

The slave command line I use: /usr/local/sbin/mesos-slave
--master=zk://test-host:2181/mesos --log_dir=/var/log/mesos 
--cgroups_limit_swap --isolation=cgroups/cpu,cgroups/mem 
--work_dir=/var/lib/mesos --attributes=host:test-host;rack:unspecified

Any more ideas?


On 27.09.2014 19:34, CCAAT wrote:
> On 09/26/14 06:20, Stephan Erb wrote:
>> Hi everyone,
>> I am having issues with the cgroups isolation of Mesos. It seems like
>> tasks are prevented from allocating more memory than their limit.
>> However, they are never killed.
>> I am running Aurora and Mesos 0.20.1 using the cgroups isolation on
>> Debian 7 (kernel 3.2.60-1+deb7u3).
> Maybe a newer kernel might help? I've poked around for some
> suggestions on the kernel-configuration file for servers running
> mesos, but nobody is talking about how they "tweak" their kernel
> settings, yet.
> Here's a good article on default shared memory limits:
> [1]
> Also, I'm not sure if OOM-Killer works on kernel space problems
> where memory is grabbed up continuously by the kernel. That may
> not even be your problem. I know OOM-killer works on userspace
> memory problems.
> Kernelshark is your friend....
> hth,
> James
