mesos-issues mailing list archives

From "James DeFelice (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-3363) custom executor's child process intermittently leaks to be a child of slave
Date Sun, 06 Sep 2015 12:07:45 GMT

    [ https://issues.apache.org/jira/browse/MESOS-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732359#comment-14732359 ]

James DeFelice commented on MESOS-3363:
---------------------------------------

No, I did not have that namespace enabled when I observed the problem.





-- 
James DeFelice
585.241.9488 (voice)
650.649.6071 (fax)


> custom executor's child process intermittently leaks to be a child of slave
> ---------------------------------------------------------------------------
>
>                 Key: MESOS-3363
>                 URL: https://issues.apache.org/jira/browse/MESOS-3363
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.23.0
>         Environment: {code}
> vagrant@node-1:~$ uname -a
> Linux node-1 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> vagrant@node-1:~$ dpkg -l | grep -e mesos
> ii  mesos                               0.23.0-1.0.ubuntu1404            amd64       Cluster resource manager with efficient resource isolation
> {code}
>            Reporter: James DeFelice
>              Labels: mesosphere
>
> I was testing a custom executor implementation that manages the life cycle of multiple
> child processes. When the executor is SIGTERM'd it sends a SIGTERM to each child process and
> then self-terminates.
> In some cases, the child processes do not die, even though the parent process (the custom
> executor) does. Instead the child procs are re-parented to the slave process, where they continue
> to live on indefinitely.
> My custom executor is written in Go, and I've found a useful Go/Linux-specific setting
> that allows me to configure a signal to be sent to child procs upon the death of the calling
> thread in the parent. (see https://golang.org/src/syscall/exec_linux.go?s=6285:6843#1 for
> details). I've since configured the custom executor to specify that a SIGKILL be sent to all
> child procs upon termination of the executor (parent) process: child procs are still sent
> a SIGTERM upon receipt of such by the executor, but the SIGKILL upon executor death now acts
> as a fallback.
> Since implementing the above work-around I have not been able to reproduce the problem
> as previously described. This particular syscall is implemented in very few OSes (the Golang
> hack only supports Linux), so I'm not sure how I'd go about something similar on Windows, OS
> X, BSD, etc.
> It seems like mesos should take on the responsibility to ensure that when an executor
> is killed, all of its child procs are also eventually killed. Given that it's an intermittent
> and hard to reproduce problem, I'm assuming that mesos *does* attempt to ensure executor child
> proc death, but that the implementation is racy/leaky.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
