hadoop-yarn-issues mailing list archives

From "Chandni Singh (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period
Date Tue, 28 Aug 2018 21:16:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595603#comment-16595603 ]

Chandni Singh edited comment on YARN-8706 at 8/28/18 9:15 PM:
--------------------------------------------------------------

{quote}
Docker stop already covers sending the custom signal, and also 10 second grace period, then
SIGKILL. I think it would be safe to skip DelayProcessKiller for docker containers.
{quote}
[~eyang] I want to highlight some issues with this approach:
- NM has a setting {{NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}} that promises to kill containers
(irrespective of their type) after this delay. This setting is in milliseconds, whereas docker
stop takes its grace period only in seconds. As a result, the grace period cannot exactly match
what the user specified with {{NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}}. This assumes
that we will deprecate {{NM_DOCKER_STOP_GRACE_PERIOD}} because it is redundant.

- From NM's perspective, regardless of the runtime, it needs to kill the process after the
period specified in milliseconds. I think this is correct because we are already in the process
of integrating additional runtimes, and NM needs to guarantee that the container will definitely
be killed and that the resources it uses will therefore be released.
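
To make the unit mismatch in the first point concrete, here is a minimal sketch. It assumes the millisecond delay would be rounded up to the next whole second before being handed to docker stop; that rounding rule, the class name {{GracePeriodRounding}} and both method names are hypothetical and are not taken from the NM code.

```java
// Hypothetical illustration only: none of these names exist in Hadoop.
final class GracePeriodRounding {

    // Assumed rule: round the millisecond delay up to whole seconds,
    // because docker stop takes its grace period in seconds.
    static long toWholeSeconds(long delayMs) {
        if (delayMs <= 0) {
            return 0;
        }
        return (delayMs + 999) / 1000;
    }

    // Extra time, in milliseconds, that the container would get compared
    // with what the user configured in milliseconds.
    static long discrepancyMs(long delayMs) {
        return toWholeSeconds(delayMs) * 1000 - delayMs;
    }

    public static void main(String[] args) {
        long[] samples = {250L, 1000L, 1500L, 10_000L};
        for (long ms : samples) {
            System.out.println(ms + " ms -> " + toWholeSeconds(ms)
                + " s (off by " + discrepancyMs(ms) + " ms)");
        }
    }
}
```

Under this assumed rounding, any configured delay that is not a multiple of 1000 ms ends up with a longer grace period than the user asked for.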




was (Author: csingh):
{quote}
Docker stop already covers sending the custom signal, and also 10 second grace period, then
SIGKILL. I think it would be safe to skip DelayProcessKiller for docker containers.
{quote}
[~eyang] I want to highlight some issues with this approach:
- NM has a setting {{NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}} that promises to kill the containers
(irrespective of their types) after this delay. This setting is in milliseconds and docker
stop takes only seconds as arguments. This creates discrepancy in the grace period to be exact
as what the user specified with {{NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}}. This is assuming that
we will deprecate {{NM_DOCKER_STOP_GRACE_PERIOD}} as it is redundant.

- From NM's perspective, regardless of the runtime, it needs to kill the process after the
period specified in millis which I think is correct because since we are already in the process
of integrating additional runtimes and NM needs to guarantee the container will definitely
be killed and therefore, the resources used by the container will be released. 



> DelayedProcessKiller is executed for Docker containers even though docker stop sends
a KILL signal after the specified grace period
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8706
>                 URL: https://issues.apache.org/jira/browse/YARN-8706
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Chandni Singh
>            Assignee: Chandni Singh
>            Priority: Major
>              Labels: docker
>
> {{DockerStopCommand}} adds a grace period of 10 seconds.
> 10 seconds is also the default grace time used by docker stop
>  [https://docs.docker.com/engine/reference/commandline/stop/]
> Documentation of the docker stop:
> {quote}the main process inside the container will receive {{SIGTERM}}, and after a grace
> period, {{SIGKILL}}.
> {quote}
> There is a {{DelayedProcessKiller}} in {{ContainerExecutor}} which executes for all
> containers after a delay when {{sleepDelayBeforeSigKill>0}}. By default this is set to
> {{250 milliseconds}}, so it always gets executed irrespective of the container type.
>  
> For a docker container, {{docker stop}} takes care of sending a {{SIGKILL}} after the
> grace period
> - when sleepDelayBeforeSigKill > 10 seconds, there is no point in executing DelayedProcessKiller
> - when sleepDelayBeforeSigKill < 1 second, the grace period should be the smallest
> value, which is 1 second, because we force a kill after 250 ms anyway
>  
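
The two cases listed in the description can be sketched as follows. This is only an illustration: the class {{DockerKillPolicy}} and its methods are hypothetical, the 10 second constant is the {{DockerStopCommand}} grace period mentioned above, and keeping 10 seconds for delays between 1 and 10 seconds is an assumption, since the description does not cover that range.

```java
// Hypothetical illustration only: none of these names exist in Hadoop.
final class DockerKillPolicy {

    // Grace period that DockerStopCommand adds, per the issue description.
    static final long DOCKER_STOP_GRACE_MS = 10_000L;

    // Case 1: if the NM delay is longer than the docker stop grace period,
    // docker stop has already sent SIGKILL by the time the delayed killer
    // would run, so running it adds nothing.
    static boolean delayedKillerIsRedundant(long sleepDelayBeforeSigKillMs) {
        return sleepDelayBeforeSigKillMs > DOCKER_STOP_GRACE_MS;
    }

    // Case 2: if the NM delay is under one second, use the smallest grace
    // period docker stop can express, which is 1 second.
    // Assumption: for every other delay, keep the 10 second default.
    static int dockerStopGraceSeconds(long sleepDelayBeforeSigKillMs) {
        if (sleepDelayBeforeSigKillMs < 1000L) {
            return 1;
        }
        return (int) (DOCKER_STOP_GRACE_MS / 1000L);
    }

    public static void main(String[] args) {
        long[] delays = {250L, 5_000L, 15_000L};
        for (long ms : delays) {
            System.out.println("delay=" + ms + " ms, docker stop grace="
                + dockerStopGraceSeconds(ms) + " s, delayed killer redundant="
                + delayedKillerIsRedundant(ms));
        }
    }
}
```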



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


