hadoop-yarn-issues mailing list archives

From "Eric Badger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8206) Sending a kill does not immediately kill docker containers
Date Mon, 07 May 2018 22:03:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466529#comment-16466529
] 

Eric Badger commented on YARN-8206:
-----------------------------------

[~eyang], [~shanekumpf@gmail.com], [~jlowe], [~Jim_Brennan], I've come up with two different
ways to solve the privileged container issue and I'd like your input on which route to go
(though I have a slight preference). In both proposals, non-privileged containers will be
signaled using {{kill}} instead of {{docker kill}}.

Proposal 1:
The container-executor will not give up being root when it goes to signal the container, and
will thus signal the container as root. This would apply only to privileged containers, but
it is something that is not currently possible (right now, signaling has to call {{set_user()}},
and you cannot set "root" as the user).

Proposal 2:
Use the docker API for privileged containers, just like the code does today. This way, we
won't be killing arbitrary processes as root, just wielding the docker daemon as root as we
do today. The downside here is that we have to go through a docker API call, which is slower
than just sending the signal straight to the process.

My preference would be for Proposal 2, as I'm not super comfortable allowing the container-executor
to kill arbitrary processes as root if someone were somehow able to compromise the NM user. Currently,
compromising the NM user would only let you kill arbitrary non-root processes.
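
To make the split concrete, here is a rough Java sketch of the routing that Proposal 2 implies:
non-privileged containers get a plain {{kill}} on the process, privileged containers go through
{{docker kill}}. This is only an illustration and not the actual container-executor code (which
is C); the class name, method name, and parameters are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in YARN.
final class SignalRouting {
    enum Signal { TERM, KILL }

    /**
     * Builds the command line that would deliver the signal.
     * Privileged containers are signaled through the docker daemon
     * (Proposal 2); everything else is signaled directly by pid.
     */
    static List<String> signalCommand(boolean privileged, Signal signal,
                                      int pid, String containerName) {
        if (privileged) {
            // Slower path: an extra round trip through the docker daemon,
            // but nothing other than the daemon acts as root.
            return Arrays.asList("docker", "kill",
                "--signal=" + signal.name(), containerName);
        }
        // Fast path: send the signal straight to the container process.
        return Arrays.asList("kill", "-" + signal.name(), String.valueOf(pid));
    }

    public static void main(String[] args) {
        System.out.println(signalCommand(false, Signal.KILL, 1234, "container_x"));
        System.out.println(signalCommand(true, Signal.KILL, 1234, "container_x"));
    }
}
```

The point of keeping the decision in one place is that the privileged path never needs the
container-executor itself to send signals as root.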

> Sending a kill does not immediately kill docker containers
> ----------------------------------------------------------
>
>                 Key: YARN-8206
>                 URL: https://issues.apache.org/jira/browse/YARN-8206
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>              Labels: Docker
>         Attachments: YARN-8206.001.patch, YARN-8206.002.patch, YARN-8206.003.patch, YARN-8206.004.patch
>
>
> {noformat}
>         if (ContainerExecutor.Signal.KILL.equals(signal)
>             || ContainerExecutor.Signal.TERM.equals(signal)) {
>           handleContainerStop(containerId, env);
> {noformat}
> Currently in the code, we are handling both SIGKILL and SIGTERM as equivalent for docker
> containers. However, they should actually be separate. When YARN sends a SIGKILL to a process,
> it means for the process to die immediately and not sit around waiting for anything. This ensures
> an immediate reclamation of resources. Additionally, if a SIGTERM is sent before the SIGKILL,
> the task might not handle the signal correctly, and will then end up as a failed task instead
> of a killed task. This is especially bad for preemption.
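
The separation described above could look roughly like the following Java sketch, where KILL
and TERM map to different actions instead of sharing one stop handler. The class, enum, and
method names are hypothetical and are not taken from any of the attached patches.

```java
// Hypothetical illustration only; not the patch attached to this issue.
final class StopPolicy {
    enum Signal { TERM, KILL }
    enum Action { GRACEFUL_STOP, IMMEDIATE_KILL }

    /** Maps each signal to its own action rather than treating both as "stop". */
    static Action actionFor(Signal signal) {
        switch (signal) {
            case KILL:
                // Die now, so resources are reclaimed immediately.
                return Action.IMMEDIATE_KILL;
            case TERM:
                // Give the process a chance to shut down cleanly.
                return Action.GRACEFUL_STOP;
            default:
                throw new IllegalArgumentException("Unhandled signal: " + signal);
        }
    }

    public static void main(String[] args) {
        System.out.println(actionFor(Signal.KILL));
        System.out.println(actionFor(Signal.TERM));
    }
}
```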



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

