mesos-issues mailing list archives

From "Qian Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-9076) Mesos agent will be wrongly treated as unknown orphaned container if `--cgroups_root` has a leading slash
Date Mon, 16 Jul 2018 07:07:00 GMT

    [ https://issues.apache.org/jira/browse/MESOS-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16544871#comment-16544871 ]

Qian Zhang commented on MESOS-9076:
-----------------------------------

The root cause of this issue is that a leading slash in {{--cgroups_root}} makes [this check|https://github.com/apache/mesos/blob/1.6.0/src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp#L241] evaluate to false,
so the agent cgroup is treated as an unknown orphaned container.
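
A minimal sketch of how a leading slash can defeat such a check. It is not the Mesos code. It assumes the check is a plain string comparison between a cgroup name found under the hierarchy (reported without a leading slash) and the configured root joined with {{slave}}. The function names are hypothetical, and the normalized variant is only one possible way to make the comparison robust, not necessarily the fix adopted in Mesos.

```python
import posixpath


def is_agent_cgroup(cgroup: str, cgroups_root: str) -> bool:
    """Literal comparison (assumed shape of the check, not the real code)."""
    return cgroup == posixpath.join(cgroups_root, "slave")


def is_agent_cgroup_normalized(cgroup: str, cgroups_root: str) -> bool:
    """Hypothetical variant: ignore leading/trailing slashes on both sides."""
    expected = posixpath.join(cgroups_root, "slave")
    return cgroup.strip("/") == expected.strip("/")


if __name__ == "__main__":
    found = "mesos/slave"  # assumed form of a cgroup name found during recovery
    for root in ("mesos", "/mesos"):
        print(root, is_agent_cgroup(found, root),
              is_agent_cgroup_normalized(found, root))
```

With a root of {{/mesos}} the literal comparison does not match {{mesos/slave}}, so under these assumptions the agent cgroup would not be skipped and would fall through to the orphan cleanup path.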

> Mesos agent will be wrongly treated as unknown orphaned container if `--cgroups_root` has a leading slash
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-9076
>                 URL: https://issues.apache.org/jira/browse/MESOS-9076
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent, containerization
>            Reporter: Qian Zhang
>            Priority: Major
>
> When the agent is started with the following flags:
>  * --cgroups_root=<a value with a leading slash>, e.g., {{/mesos}}.
>  * --agent_subsystems=<some cgroups subsystems>, e.g., {{memory}}.
>  * --isolation=<some cgroups subsystems that include the ones specified in --agent_subsystems>, e.g., {{cgroups/cpu,cgroups/mem}}.
> the agent itself is treated as an unknown orphaned container:
> {code:java}
> I0716 14:55:59.969400 3892 containerizer.cpp:718] Recovering Mesos containers
> I0716 14:55:59.970304 3888 linux_launcher.cpp:299] Recovering Linux launcher
> I0716 14:55:59.973687 3894 containerizer.cpp:1025] Recovering isolators
> W0716 14:55:59.978680 3891 cgroups.cpp:352] Couldn't find the cgroup '/mesos/slave' in hierarchy '/sys/fs/cgroup/cpu,cpuacct' for container slave
> I0716 14:55:59.981603 3892 memory.cpp:478] Started listening for OOM events for container slave
> I0716 14:55:59.983178 3892 memory.cpp:590] Started listening on 'low' memory pressure events for container slave
> I0716 14:55:59.985761 3892 memory.cpp:590] Started listening on 'medium' memory pressure events for container slave
> I0716 14:55:59.987036 3892 memory.cpp:590] Started listening on 'critical' memory pressure events for container slave
> I0716 14:55:59.988668 3893 cgroups.cpp:320] Cleaning up unknown orphaned container slave
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
