hadoop-yarn-issues mailing list archives

From "Shane Kumpf (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4759) Revisit signalContainer() for docker containers
Date Thu, 14 Apr 2016 21:08:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241924#comment-15241924 ]

Shane Kumpf commented on YARN-4759:
-----------------------------------

After considering the options for ensuring a graceful stop of processes that require special
signal handling, I believe this shouldn't be left up to signalContainer or even YARN. Users
whose containers have special signal handling needs should understand how docker manages
signaling on docker stop. The stop signal can be specified via docker run (--stop-signal) or
in the Dockerfile (STOPSIGNAL), and users should do so if they require it.
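
For illustration only, a minimal sketch of how a launcher might pass a stop signal through to
docker run. The class name, method name, and image name are hypothetical and not part of YARN;
--stop-signal is the docker run option that sets the signal docker stop sends.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of YARN or the patch discussed here. */
final class StopSignalSketch {
    /**
     * Builds a "docker run" argument list. If stopSignal is non-null it is
     * passed via --stop-signal, so that a later "docker stop" sends that
     * signal instead of the default.
     */
    static List<String> dockerRunCommand(String image, String stopSignal) {
        List<String> cmd = new ArrayList<>();
        cmd.add("docker");
        cmd.add("run");
        if (stopSignal != null) {
            cmd.add("--stop-signal");
            cmd.add(stopSignal);
        }
        cmd.add(image); // image name is supplied by the caller
        return cmd;
    }
}
```

The same effect can be baked into the image with a STOPSIGNAL line in the Dockerfile, in which
case no extra docker run option is needed.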

Given the above, the approach I've taken is as follows:

1) For container liveness checks using the null signal, run kill -0 from the host on the
container's PID 1. We already get the appropriate PID via docker inspect in container-executor.
This is the same as how DefaultLinuxContainerRuntime handles liveness checks.
2) For any other signal, call docker stop on the docker container.

If a network container is requested, docker stop will be called on it as well.
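
The dispatch described above can be sketched roughly as follows. This is not the patch itself:
the class, enum, and method names are hypothetical, and the sketch only builds the command lines
rather than executing them through container-executor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the signal dispatch; not the actual runtime code. */
final class DockerSignalSketch {
    /** Illustrative signal set; NULL is the liveness probe. */
    enum Signal { NULL, QUIT, TERM, KILL }

    /**
     * Returns the commands to run for a signal request.
     * hostPid is the host-side PID of the container's PID 1 (assumed to
     * have been looked up via docker inspect). networkContainerName may be
     * null when no network container was requested.
     */
    static List<List<String>> commandsFor(Signal signal, String hostPid,
            String containerName, String networkContainerName) {
        List<List<String>> commands = new ArrayList<>();
        if (signal == Signal.NULL) {
            // Liveness check: kill -0 sends no signal, it only tests the PID.
            commands.add(Arrays.asList("kill", "-0", hostPid));
            return commands;
        }
        // Any other signal: let docker stop deliver the configured stop signal.
        commands.add(Arrays.asList("docker", "stop", containerName));
        if (networkContainerName != null) {
            commands.add(Arrays.asList("docker", "stop", networkContainerName));
        }
        return commands;
    }
}
```

Note that under this scheme the specific signal requested (TERM, KILL, and so on) is not
forwarded; docker stop decides what the container receives.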

I'm working on a patch that does the above.

> Revisit signalContainer() for docker containers
> -----------------------------------------------
>
>                 Key: YARN-4759
>                 URL: https://issues.apache.org/jira/browse/YARN-4759
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Sidharta Seethana
>            Assignee: Shane Kumpf
>
> The current signal handling (in the DockerContainerRuntime) needs to be revisited for
> docker containers. For example, container reacquisition on NM restart might not work, depending
> on which user the process in the container runs as.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
