mesos-dev mailing list archives

From "Benjamin Mahler (JIRA)" <>
Subject [jira] [Commented] (MESOS-943) Provide an abstraction for asynchronous launching of subprocesses.
Date Fri, 24 Jan 2014 21:25:38 GMT


Benjamin Mahler commented on MESOS-943:

Thanks for bringing this up, Nikita!

For (3) above: From the libprocess point of view, we don't want to reap all children by default.
Some other libraries, such as libev, do reap all children, and it appears to be fairly common
for users to override that behavior (see Tim's comments in MESOS-895).

For (1) above: Our code is designed to be asynchronous, which means blocking worker threads
is problematic. So, we'd like to provide an asynchronous notification mechanism for reaping.
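
To illustrate the two points above, here is a minimal sketch assuming plain POSIX waitpid().
The helper name tryReap is hypothetical and is not part of libprocess. Passing a specific pid
(rather than -1) leaves other children for their owners to reap, and WNOHANG keeps the call
from blocking the calling thread.

```cpp
#include <sys/types.h>
#include <sys/wait.h>

// Hypothetical helper (not libprocess API). Returns true and stores the raw
// wait status in *status if `pid` has exited. Returns false if `pid` is still
// running or if waitpid() failed (e.g. ECHILD because `pid` is not our child).
// Only `pid` is examined, so no other child is reaped as a side effect.
bool tryReap(pid_t pid, int* status)
{
  int raw = 0;

  // WNOHANG: return immediately instead of blocking until the child exits.
  pid_t result = ::waitpid(pid, &raw, WNOHANG);

  if (result == pid) {
    *status = raw;
    return true;
  }

  return false; // result == 0: still running; result == -1: error.
}
```

A caller would invoke tryReap(pid, &status) periodically for each watched pid and decode the
result with WIFEXITED / WEXITSTATUS. By contrast, waitpid(-1, ...) would collect the exit
status of any child, including ones forked by other components.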

(2) is the modified approach I posted in my reviews, but as you mentioned, it comes with the
downside of a busy loop. I'll include some comments in the code on how to optimize this,
which brings up a 4th option:

4. Providing a process::reap(pid_t) utility which uses a thread for each pid watched. Each
thread will make a blocking call to waitpid(). This avoids the busy loop, but I will hold
off on making such an optimization for now.
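
A rough sketch of what option 4 could look like, under stated assumptions: it uses std::future
and std::async from the standard library rather than libprocess's Future, and the function
name reap and the -1 error value are placeholders, not the eventual process::reap interface.

```cpp
#include <sys/types.h>
#include <sys/wait.h>

#include <cerrno>
#include <future>

// Hypothetical stand-in for the proposed process::reap(pid_t). Each call
// runs a dedicated thread that blocks in waitpid() on that one pid, so no
// polling loop is needed. The future yields the raw wait status, or -1 if
// waitpid() failed (an assumption made for this sketch).
std::future<int> reap(pid_t pid)
{
  // std::launch::async forces the task onto its own thread.
  return std::async(std::launch::async, [pid]() {
    int status = 0;
    pid_t result;

    // Retry if the blocking call is interrupted by a signal.
    do {
      result = ::waitpid(pid, &status, 0);
    } while (result == -1 && errno == EINTR);

    return result == pid ? status : -1;
  });
}
```

The caller gets the exit status via future.get() and decodes it with WIFEXITED / WEXITSTATUS.
Note that the destructor of a future returned by std::async blocks until the task finishes, so
this sketch does not support discarding the future; the cost is also one thread per watched pid.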

> Provide an abstraction for asynchronous launching of subprocesses.
> ------------------------------------------------------------------
>                 Key: MESOS-943
>                 URL:
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
> This has come up during [~idownes]'s changes to add containerization.
> We would like to be able to run commands asynchronously like:
> {{curl -O}}
> Currently, there is not an easy way to do this while having:
> 1. A Future handle on the exit status of the subprocess.
> 2. The means to 'discard' the future and consequently kill the subprocess (e.g. stalled
> hadoop command).
> 3. Handles to stdin, stdout, stderr of the subprocess.
> The first issue is that we need to re-work the Reaper to not reap _all_ subprocesses.
> Rather, we need to allow other components to reap their own forked subprocesses without the
> slave's Reaper "stealing" the exit status information. I've proposed that we move the Reaper
> into libprocess initially with the only change being to reap the desired pids. (We can optimize
> this later using a per-pid blocking thread or SIGCHLD).
> One concern is that if we 'leak' child processes by accidentally not reaping, we may
> fill the process table with zombie processes. However, we have tight control over where our
> code performs forks, and can enforce proper reaping.
