mesos-user mailing list archives

From Tomas Barton <barton.to...@gmail.com>
Subject Re: Problems with OOM
Date Fri, 26 Sep 2014 13:15:04 GMT
Just to make sure: are all slaves running with the following flag?

--isolation='cgroups/cpu,cgroups/mem'

Is there anything suspicious in the Mesos slave logs?
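
If the logs show nothing, it may also help to look at the task's memory cgroup directly. Below is a minimal Python sketch that reads the standard cgroup v1 memory files (memory.limit_in_bytes, memory.usage_in_bytes, memory.failcnt, memory.oom_control). The helper names and the example path are assumptions for illustration; the actual cgroup directory depends on where the memory hierarchy is mounted and on the slave's cgroups root.

```python
import os


def parse_oom_control(text):
    """Parse the 'key value' lines of a cgroup v1 memory.oom_control file."""
    state = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            state[parts[0]] = int(parts[1])
    return state


def read_memory_state(cgroup_dir):
    """Collect limit, usage, failcnt and OOM flags for one memory cgroup.

    cgroup_dir is a hypothetical path such as
    /sys/fs/cgroup/memory/mesos/<container-id>; adjust it to your setup.
    """
    result = {}
    for name in ("memory.limit_in_bytes",
                 "memory.usage_in_bytes",
                 "memory.failcnt"):
        with open(os.path.join(cgroup_dir, name)) as f:
            result[name] = int(f.read().strip())
    with open(os.path.join(cgroup_dir, "memory.oom_control")) as f:
        result.update(parse_oom_control(f.read()))
    return result
```

A growing memory.failcnt together with under_oom set to 1 would suggest the cgroup has hit its limit and its tasks are waiting rather than being killed, which might match the processes stuck in state D.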

On 26 September 2014 13:20, Stephan Erb <stephan.erb@blue-yonder.com> wrote:

> Hi everyone,
>
> I am having issues with the cgroups isolation of Mesos. It seems like
> tasks are prevented from allocating more memory than their limit. However,
> they are never killed.
>
>    - My scheduled task allocates memory in a tight loop. According to
>    'ps', once its memory requirements are exceeded it is not killed, but ends
>    up in the state D ("uninterruptible sleep (usually IO)").
>    - The task is still considered running by Mesos.
>    - There is no indication of an OOM in dmesg.
>    - There is neither an OOM notice nor any other output related to the
>    task in the slave log.
>    - According to htop, the system load is increased with a significant
>    portion of CPU time spent within the kernel. Commonly the load is so high
>    that all zookeeper connections time out.
>
> I am running Aurora and Mesos 0.20.1 using the cgroups isolation on Debian
> 7 (kernel 3.2.60-1+deb7u3).
>
> Sorry for the somewhat unspecific error description. Still, does anyone
> have an idea what might be wrong here?
>
> Thanks and Best Regards,
> Stephan
>
