hadoop-yarn-issues mailing list archives

From "Jim Brennan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
Date Mon, 20 Aug 2018 15:12:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586060#comment-16586060
] 

Jim Brennan commented on YARN-8648:
-----------------------------------

Thanks [~eyang]!  My main concern about the minimal fix is the security aspect, since we
will need to add an option to container-executor to tell it to delete all cgroups with a particular
name as root (since docker will create them as root).

I think this is mitigated if we use the "cgroup" section of container-executor.cfg to
constrain it.  This section is currently used to enable updating params, but I think it could be
used for this as well.  It already defines the CGROUPS_ROOT (e.g., /sys/fs/cgroup) and
the YARN_HIERARCHY (e.g., hadoop-yarn).  We could either add another config parameter to define
the list of hierarchies to clean up (e.g., cpuset, freezer, hugetlb, etc.), or we could parse
/proc/mounts to determine the full list.  I think it's safer to add the config parameter.
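
To make the two options concrete, here is a rough sketch in Python (not the actual
container-executor code, which is C).  All names below are hypothetical, including the
CLEANUP_HIERARCHIES parameter and the container id pattern; only CGROUPS_ROOT and
YARN_HIERARCHY correspond to values mentioned above.

```python
import os
import re

# Example values; the real ones would come from the "cgroup" section
# of container-executor.cfg.
CGROUPS_ROOT = "/sys/fs/cgroup"
YARN_HIERARCHY = "hadoop-yarn"
# Hypothetical new parameter: the only hierarchies cleanup may touch.
CLEANUP_HIERARCHIES = ["cpuset", "freezer", "hugetlb"]

# Assumed shape of a container id; rejects "/" and ".." outright.
CONTAINER_ID_RE = re.compile(r"container_[A-Za-z0-9_]+")


def cleanup_candidates(container_id, root=CGROUPS_ROOT,
                       yarn_hierarchy=YARN_HIERARCHY,
                       hierarchies=CLEANUP_HIERARCHIES):
    """Option 1: build the paths to delete from configured values only."""
    if not CONTAINER_ID_RE.fullmatch(container_id):
        raise ValueError("unexpected container id: %r" % container_id)
    return [os.path.join(root, h, yarn_hierarchy, container_id)
            for h in hierarchies]


def hierarchies_from_mounts(mounts_text):
    """Option 2: list cgroup (v1) mount points from /proc/mounts content."""
    mount_points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 3 and fields[2] == "cgroup":
            mount_points.append(fields[1])
    return mount_points
```

With option 1 the privileged code can only ever touch paths under the configured root,
hierarchy names, and YARN hierarchy, which is the reason it looks safer than trusting
whatever happens to be mounted.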

I will start working on this version unless there are objections?

> Container cgroups are leaked when using docker
> ----------------------------------------------
>
>                 Key: YARN-8648
>                 URL: https://issues.apache.org/jira/browse/YARN-8648
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>              Labels: Docker
>
> When you run with docker and enable cgroups for cpu, docker creates cgroups for all resources
on the system, not just for cpu.  For instance, if {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}},
the nodemanager will create a cgroup for each container under {{/sys/fs/cgroup/cpu/hadoop-yarn}}.
In the docker case, we pass this path via the {{--cgroup-parent}} command line argument.
Docker then creates a cgroup for the docker container under that, for instance: {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}.
> When the container exits, docker cleans up the {{docker_container_id}} cgroup, and the
nodemanager cleans up the {{container_id}} cgroup.  All is good under {{/sys/fs/cgroup/cpu/hadoop-yarn}}.
> The problem is that docker also creates that same hierarchy under every resource under
{{/sys/fs/cgroup}}.  On the rhel7 system I am using, these are: blkio, cpuset, devices, freezer,
hugetlb, memory, net_cls, net_prio, perf_event, and systemd.    So for instance, docker creates
{{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but it only cleans
up the leaf cgroup {{docker_container_id}}.  Nobody cleans up the {{container_id}} cgroups
for these other resources.  On one of our busy clusters, we found more than 100,000 of these leaked
cgroups.
> I found this in our 2.8-based version of hadoop, but I have been able to repro with current
hadoop.
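
For anyone who wants to check a node for the leak described above, the following is a
minimal sketch, not part of any patch.  It assumes the cgroup v1 layout from the description,
that container cgroups are directories whose names start with "container_", and that a
container directory missing from the cpu hierarchy has already exited.  The function name and
its parameters are made up for illustration.

```python
import os


def find_leaked_cgroups(root="/sys/fs/cgroup", yarn_hierarchy="hadoop-yarn",
                        managed="cpu"):
    """Return container cgroup dirs that exist outside the managed hierarchy
    but no longer exist inside it."""
    managed_dir = os.path.join(root, managed, yarn_hierarchy)
    # Containers the nodemanager still tracks in the managed (cpu) hierarchy.
    live = set(os.listdir(managed_dir)) if os.path.isdir(managed_dir) else set()

    leaked = []
    for hierarchy in sorted(os.listdir(root)):
        if hierarchy == managed:
            continue
        parent = os.path.join(root, hierarchy, yarn_hierarchy)
        if not os.path.isdir(parent):
            continue
        for name in sorted(os.listdir(parent)):
            path = os.path.join(parent, name)
            if (name.startswith("container_") and os.path.isdir(path)
                    and name not in live):
                leaked.append(path)
    return leaked


if __name__ == "__main__":
    for path in find_leaked_cgroups():
        print(path)
```

This only reports candidates; it deliberately does not remove anything, since removal
would have to run as root and is the part that needs to be constrained.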



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


