hadoop-yarn-issues mailing list archives

From "lachisis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4382) Container hierarchy in cgroup may remain for ever after the container have be terminated
Date Mon, 23 Nov 2015 07:09:10 GMT

    [ https://issues.apache.org/jira/browse/YARN-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021630#comment-15021630 ]

lachisis commented on YARN-4382:
--------------------------------

If many container hierarchies are left behind, they keep the CPUs of this node busy even when
no jobs are running. The perf top output below shows most samples in kernel scheduler functions:

------------------------------------------------------------------------------
   PerfTop:  129889 irqs/sec  kernel:76.3% [100000 cycles],  (all, 16 CPUs)
------------------------------------------------------------------------------

             samples    pcnt   kernel function
             _______   _____   _______________

           117166.00 - 59.1% : tg_shares_up
            35688.00 - 18.0% : _spin_lock_irqsave
            12045.00 -  6.1% : __set_se_shares


> Container hierarchy in cgroup may remain for ever after the container have be terminated
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-4382
>                 URL: https://issues.apache.org/jira/browse/YARN-4382
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.5.2
>            Reporter: lachisis
>
> If LinuxContainerExecutor is used to execute containers, the following problem may occur.
> In the common case, when a container runs, a corresponding hierarchy is created in the
> cgroup directory. When the container terminates, the hierarchy is deleted within a few seconds
> (this time can be configured by yarn.nodemanager.linux-container-executor.cgroups.delete-delay-ms).
> In the code, I find that CgroupsLCEResourcesHandler sends a signal to kill the container process
> asynchronously and, at the same time, tries to delete the container hierarchy within the
> configured "delete-delay-ms" time.
> But if the container process takes longer than the "delete-delay-ms" time to die, the
> deletion attempts stop while the hierarchy is still in use, and the container hierarchy
> remains forever.
>   
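
The race described above can be illustrated with a small, self-contained sketch. This is not the NodeManager code: the class name CgroupDeleteSketch and the method tryDeleteWithin are hypothetical, and a marker file inside a temporary directory stands in for a process that is still attached to the cgroup (a non-empty directory cannot be removed, much as a cgroup directory that still has tasks cannot be). The sketch only shows that a bounded retry loop gives up, and leaves the directory behind, when the "process" outlives the timeout.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CgroupDeleteSketch {

    /**
     * Hypothetical helper: retries removal of dir until it succeeds or
     * timeoutMs has elapsed. Returns true if the directory is gone.
     */
    static boolean tryDeleteWithin(Path dir, long timeoutMs, long retryIntervalMs)
            throws InterruptedException {
        final long start = System.nanoTime();
        final long timeoutNanos = timeoutMs * 1_000_000L;
        while (true) {
            try {
                Files.deleteIfExists(dir);
                return true; // directory removed (or already absent)
            } catch (IOException e) {
                // Directory is still in use (here: not empty); retry below.
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false; // gave up; the directory is left behind
            }
            Thread.sleep(retryIntervalMs);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a container hierarchy with a process still attached.
        Path dir = Files.createTempDirectory("container_sketch");
        Path marker = Files.createFile(dir.resolve("tasks"));

        // The "process" outlives the timeout, so deletion gives up.
        System.out.println("deleted while busy: " + tryDeleteWithin(dir, 100, 20));

        // Once the "process" is gone, a later attempt succeeds, but nothing
        // in the scenario described above triggers such a later attempt.
        Files.delete(marker);
        System.out.println("deleted after exit: " + tryDeleteWithin(dir, 100, 20));
    }
}
```

In this sketch the first call returns false and the second returns true. The point is that once the timed loop has given up, the leftover directory is only cleaned up if something retries later.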



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
