hadoop-yarn-issues mailing list archives

From "Shane Kumpf (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5366) Add support for toggling the removal of completed and failed docker containers
Date Mon, 03 Apr 2017 17:25:41 GMT

    [ https://issues.apache.org/jira/browse/YARN-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953872#comment-15953872 ]

Shane Kumpf commented on YARN-5366:

Thanks [~vinodkv]! Responses below.

Signal.QUIT handling is very application specific. For example, nginx does a graceful shutdown while
JVMs do a thread dump and don't shut down at all. We shouldn't stop / rm container for QUIT at

I addressed this in another design document, but here is the gist of it. While it is possible
to do a {{docker kill --signal SIGQUIT}}, this is of limited usefulness and may result in
unexpected behavior. The signal is always sent to PID 1 in the container. Depending on the
image or app type, this may not be the process we want to catch that signal. Alternatively,
users can specify the STOPSIGNAL in the Dockerfile, since the user likely has a better understanding
of the implications for that application/image type. Thoughts on how this should be handled?
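
As a rough illustration of the two options being compared, here is a minimal Python sketch that only builds the command lines. The helper names and the container ID are hypothetical and nothing here is taken from the patch; the only real pieces are the `docker kill --signal` and `docker stop -t` CLI forms.

```python
from typing import List


def kill_command(container_id: str, signal: str = "SIGQUIT") -> List[str]:
    """Build `docker kill --signal=<signal> <container>`.

    Docker delivers the signal to PID 1 of the container, which may or
    may not be the process that should handle it.
    """
    return ["docker", "kill", f"--signal={signal}", container_id]


def stop_command(container_id: str, grace_seconds: int = 10) -> List[str]:
    """Build `docker stop -t <seconds> <container>`.

    `docker stop` sends the image's STOPSIGNAL (SIGTERM unless the
    Dockerfile overrides it) and falls back to SIGKILL once the grace
    period expires.
    """
    return ["docker", "stop", "-t", str(grace_seconds), container_id]


if __name__ == "__main__":
    # Hypothetical container ID, used only to show the resulting commands.
    print(" ".join(kill_command("example_container")))
    print(" ".join(stop_command("example_container")))
```

With the second form, the choice of signal stays with whoever wrote the Dockerfile rather than with the caller.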

I think the best we can do is to send the intent to container-executor binary and let it do
stop and rm in one shot so as to save on multiple launches.

IMO, moving more of this logic into container-executor (c-e) complicates matters and doesn't follow
what we've done so far. Nearly all existing DockerCommands execute via c-e as a single Docker CLI command.
If the concern is the performance hit, the Stop command here is a safeguard and should not normally
get called, as the container should already be completed. However, you can't rm a container that isn't
stopped, so ensuring it has been stopped is necessary.
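
The ordering constraint can be sketched as follows, in Python for brevity. This is an illustration under assumptions, not the patch: `cleanup_commands` is a hypothetical helper, and the status string is assumed to be the state Docker reports for the container (for example via `docker inspect`). Each returned entry stands for one single-purpose Docker CLI invocation, issued separately.

```python
from typing import List


def cleanup_commands(container_id: str, status: str) -> List[List[str]]:
    """Return the Docker CLI invocations needed to remove a container.

    A plain `docker rm` fails on a running container, so a `docker stop`
    is issued first as a safeguard. For a container that has already
    exited, only the `docker rm` is returned. Other states (paused,
    restarting, ...) are deliberately ignored in this sketch.
    """
    commands: List[List[str]] = []
    if status == "running":
        commands.append(["docker", "stop", container_id])
    commands.append(["docker", "rm", container_id])
    return commands


if __name__ == "__main__":
    # Hypothetical container ID; the common case needs a single command.
    for cmd in cleanup_commands("example_container", "exited"):
        print(" ".join(cmd))
```

In the expected case the container has already completed, so the stop step is skipped and the cost is a single launch.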

I've created and posted patches to YARN-6366 (Refactor the NodeManager DeletionService to
support additional DeletionTask types) and YARN-6374 (Improve test coverage and add utility
classes for common Docker operations). These are the prerequisites to have docker containers
honor the debug delay.

> Add support for toggling the removal of completed and failed docker containers
> ------------------------------------------------------------------------------
>                 Key: YARN-5366
>                 URL: https://issues.apache.org/jira/browse/YARN-5366
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Shane Kumpf
>            Assignee: Shane Kumpf
>              Labels: oct16-medium
>         Attachments: YARN-5366.001.patch, YARN-5366.002.patch, YARN-5366.003.patch, YARN-5366.004.patch,
>                      YARN-5366.005.patch, YARN-5366.006.patch
>
> Currently, completed and failed docker containers are removed by container-executor.
> Add a job level environment variable to DockerLinuxContainerRuntime to allow the user to toggle
> whether they want the container deleted or not and remove the logic from container-executor.

