hadoop-yarn-issues mailing list archives

From "Sidharta Seethana (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
Date Wed, 03 Jun 2015 00:10:52 GMT

    [ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570037#comment-14570037 ]

Sidharta Seethana commented on YARN-2194:

There are two different issues here:

* The container-executor binary invocation uses ‘,’ as the separator when supplying a list of
paths, which breaks when a path itself contains ‘,’.
* cpu and cpuacct are mounted together by default on RHEL7.
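
To make the first issue concrete, here is a minimal Java sketch. It is not the actual NodeManager or container-executor code: the class {{SeparatorDemo}}, its helper methods, and the sample cgroup path are all hypothetical and only illustrate what happens when a ‘,’-joined list carries a path that itself contains ‘,’.

```java
import java.util.Arrays;
import java.util.List;

public class SeparatorDemo {
    // Hypothetical sender side: join all paths into one argument using ','.
    static String joinPaths(List<String> paths) {
        return String.join(",", paths);
    }

    // Hypothetical receiver side: recover the list by splitting on ','.
    static List<String> splitPaths(String arg) {
        return Arrays.asList(arg.split(","));
    }

    public static void main(String[] args) {
        // Illustrative path only; on RHEL7 the controller directory contains a comma.
        List<String> sent = Arrays.asList(
            "/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_01/tasks");

        List<String> received = splitPaths(joinPaths(sent));

        System.out.println("sent " + sent.size() + " path(s), received " + received.size());
        for (String p : received) {
            System.out.println("  " + p);
        }
    }
}
```

A single path goes in and two fragments come out, neither of which is a valid location. Any fix would need a separator (or an escaping scheme) that cannot appear in the paths being passed.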

Now, for the latter issue: in {{CgroupsLCEResourcesHandler}}, the following steps occur:

* If the {{yarn.nodemanager.linux-container-executor.cgroups.mount}} switch is enabled, the
‘cpu’ controller is explicitly mounted at the specified path.
* Irrespective of the state of the switch, the {{/proc/mounts}} file (possibly updated by
the previous step) is then parsed to determine the mount locations of the various
cgroup controllers. This parsing code appears to be correct even when cpu and cpuacct are
mounted in one location.
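
The parsing step can be sketched roughly as follows. This is an illustration written for this comment, not the code in {{CgroupsLCEResourcesHandler}}: the class {{MountsSketch}}, the controller subset, and the sample lines are assumptions. It relies only on the standard {{/proc/mounts}} layout (device, mount point, filesystem type, options) and takes controller names from the mount options, so a combined mount maps both controllers to the same directory.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MountsSketch {
    // Illustrative subset of controller names to look for.
    static final Set<String> CONTROLLERS =
        new HashSet<>(Arrays.asList("cpu", "cpuacct", "memory", "blkio"));

    // Maps each controller found in the mount options to its mount point.
    static Map<String, String> controllerMounts(List<String> lines) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            // Fields: device, mount point, fs type, options, ...
            if (f.length < 4 || !"cgroup".equals(f[2])) {
                continue;
            }
            for (String opt : f[3].split(",")) {
                if (CONTROLLERS.contains(opt)) {
                    result.putIfAbsent(opt, f[1]);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input resembling a system with cpu and cpuacct co-mounted (assumed).
        List<String> sample = Arrays.asList(
            "cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0",
            "cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0",
            "tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0");

        controllerMounts(sample).forEach((c, p) -> System.out.println(c + " -> " + p));
    }
}
```

Because the mount point is read as a whole whitespace-delimited field, the comma in the directory name does not disturb this step; the comma only becomes a problem later, when the resulting paths are joined into a ‘,’-separated argument.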

So the thing we need to fix is the separator issue, and then we should be good. The important
thing to remember is that there are *two* cgroups implementation classes ({{CgroupsLCEResourcesHandler}}
and {{CGroupsHandlerImpl}}). Hopefully this duplication will be addressed soon (YARN-3542);
otherwise we risk divergence.

> Cgroups cease to work in RHEL7
> ------------------------------
>                 Key: YARN-2194
>                 URL: https://issues.apache.org/jira/browse/YARN-2194
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: Wei Yan
>            Assignee: Wei Yan
>            Priority: Critical
>         Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch
> In RHEL7, the CPU controller is named "cpu,cpuacct". The comma in the controller name
> leads to container launch failure.
> RHEL7 deprecates libcgroup and recommends the use of systemd. However, systemd has certain
> shortcomings as identified in this JIRA (see comments).
> This JIRA only fixes the failure, and doesn't try to use systemd.

This message was sent by Atlassian JIRA
