hadoop-yarn-issues mailing list archives

From "Chandni Singh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period
Date Tue, 28 Aug 2018 00:26:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594348#comment-16594348 ]

Chandni Singh commented on YARN-8706:
-------------------------------------

I can see two ways to address this:

Approach 1:
1. Deprecate {{YarnConfiguration.NM_DOCKER_STOP_GRACE_PERIOD}}. {{YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}} will trigger the container kill after the configured delay in milliseconds.
2. Nothing else changes. By default, docker stop uses a grace period of 10 seconds, and even if {{DelayedProcessKiller}} executes after this, it will check whether the process is in a stoppable state.

This requires no code change other than deprecating {{YarnConfiguration.NM_DOCKER_STOP_GRACE_PERIOD}}.
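
As a rough illustration of why a late {{DelayedProcessKiller}} run is harmless under Approach 1, here is a minimal sketch of a delayed kill that checks the process state first. This is not the YARN implementation: {{DelayedKillSketch}}, {{delayedKill}}, {{stillRunning}} and {{sendSigKill}} are hypothetical names, and treating "stoppable state" as "still running" is an assumption.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; these names do not exist in Hadoop YARN.
final class DelayedKillSketch {

    /**
     * Builds the task that runs once the delay has elapsed. It only sends the
     * kill when the process is still running (assumed meaning of "stoppable").
     */
    static Runnable delayedKill(BooleanSupplier stillRunning, Runnable sendSigKill) {
        return () -> {
            if (stillRunning.getAsBoolean()) {
                sendSigKill.run();
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger kills = new AtomicInteger();

        // Simulate a container that docker stop has already terminated:
        // the delayed check finds nothing to kill.
        timer.schedule(delayedKill(() -> false, kills::incrementAndGet),
                250, TimeUnit.MILLISECONDS).get();
        timer.shutdown();

        System.out.println("kills sent: " + kills.get());
    }
}
```

The state check is what makes the ordering between docker stop and the delayed task irrelevant in this sketch: whichever finishes second simply finds no work to do.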


Approach 2: 
1. Deprecate {{YarnConfiguration.NM_DOCKER_STOP_GRACE_PERIOD}} in favor of {{YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}}.
2. For the Docker runtime, rely only on docker stop, with the grace period in seconds calculated from {{YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}}.
3. {{DelayedProcessKiller}} is NOT executed for the Docker runtime but is executed for the other runtimes.

This requires more changes:
1. {{YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}} needs to be passed to {{DockerLinuxContainerRuntime}}.
2. {{DelayedProcessKiller}} should be executed for all runtimes except {{DockerLinuxContainerRuntime}}.
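
A minimal sketch of the two decisions Approach 2 involves: deriving the docker stop grace period in seconds from the millisecond delay, and skipping the delayed kill for the Docker runtime. All names here ({{StopGraceSketch}}, {{RuntimeKind}}, {{dockerStopGraceSeconds}}, {{shouldScheduleDelayedKill}}) are hypothetical. Rounding up to whole seconds and clamping non-positive values to 1 second are assumptions; the issue description only states that delays under 1 second should map to the 1 second minimum.

```java
// Hypothetical sketch; these names do not exist in Hadoop YARN.
final class StopGraceSketch {

    enum RuntimeKind { DEFAULT, DOCKER }

    /**
     * Converts the configured delay in milliseconds to whole seconds for
     * docker stop. Rounds up (assumption) so the container never gets less
     * time than configured, and never returns less than 1 second.
     */
    static long dockerStopGraceSeconds(long sleepDelayBeforeSigKillMs) {
        if (sleepDelayBeforeSigKillMs <= 0) {
            return 1; // assumption: clamp non-positive delays to the minimum
        }
        long roundedUp = (sleepDelayBeforeSigKillMs + 999) / 1000;
        return Math.max(1, roundedUp);
    }

    /**
     * The delayed kill is skipped for Docker, where docker stop already
     * sends SIGKILL after the grace period, and kept for other runtimes
     * when the delay is positive.
     */
    static boolean shouldScheduleDelayedKill(RuntimeKind kind, long sleepDelayBeforeSigKillMs) {
        return kind != RuntimeKind.DOCKER && sleepDelayBeforeSigKillMs > 0;
    }

    public static void main(String[] args) {
        System.out.println(dockerStopGraceSeconds(250));
        System.out.println(dockerStopGraceSeconds(10_000));
        System.out.println(shouldScheduleDelayedKill(RuntimeKind.DOCKER, 250));
        System.out.println(shouldScheduleDelayedKill(RuntimeKind.DEFAULT, 250));
    }
}
```

With the default delay of 250 ms this yields a 1 second grace period for docker stop, which matches the minimum described in the issue.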


NOTE: {{YarnConfiguration.NM_DOCKER_STOP_GRACE_PERIOD}} should be deprecated in both cases.


> DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8706
>                 URL: https://issues.apache.org/jira/browse/YARN-8706
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Chandni Singh
>            Assignee: Chandni Singh
>            Priority: Major
>              Labels: docker
>
> {{DockerStopCommand}} adds a grace period of 10 seconds.
> 10 seconds is also the default grace period used by docker stop:
>  [https://docs.docker.com/engine/reference/commandline/stop/]
> Documentation of docker stop:
> {quote}the main process inside the container will receive {{SIGTERM}}, and after a grace period, {{SIGKILL}}.
> {quote}
> There is a {{DelayedProcessKiller}} in {{ContainerExecutor}} which executes for all containers after a delay when {{sleepDelayBeforeSigKill>0}}. By default this is set to {{250 milliseconds}}, so it always gets executed irrespective of the container type.
>  
> For a docker container, {{docker stop}} takes care of sending a {{SIGKILL}} after the grace period:
> - when sleepDelayBeforeSigKill > 10 seconds, there is no point in executing DelayedProcessKiller
> - when sleepDelayBeforeSigKill < 1 second, the grace period should be the smallest value, which is 1 second, because we are forcing a kill after 250 ms anyway
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


