mesos-issues mailing list archives

From "Alexander Rukletsov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-8572) Make Docker executor/containerizer resilient to Docker daemon failures.
Date Tue, 20 Mar 2018 10:51:00 GMT

    [ https://issues.apache.org/jira/browse/MESOS-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406116#comment-16406116 ]

Alexander Rukletsov commented on MESOS-8572:
--------------------------------------------

[~brat002] reports a very similar issue. Below is my loose translation of the message he sent
over private channels.

"Sometimes a Docker task does not finish correctly on {{docker stop}}. For example, in https://pastebin.com/NwgA7d7M,
{{docker stop}} hung for 10 days (!). A {{docker stop}} issued manually from a terminal hangs for
20-30 seconds and then exits cleanly, but does not stop the container. However, if {{kill -9}}
is sent to the corresponding {{mesos-docker-executor}}, the whole process tree terminates
correctly and {{docker ps}} no longer lists the container.

The hypothesis is that docker cannot terminate a container while someone is listening to its
stdin/stderr. Hence it might make sense to send {{SIGTERM}} followed by {{SIGKILL}} instead
of retrying {{docker stop}}."
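
To make the suggested escalation concrete, here is a minimal Python sketch of "SIGTERM, then SIGKILL
after a grace period" applied to a generic child process. It is not the Mesos or Docker implementation:
the function name {{terminate_with_escalation}} and the 5-second default grace period are assumptions
made for illustration only.

```python
import signal
import subprocess
import sys


def terminate_with_escalation(proc, grace_seconds=5.0):
    """Ask a child process to exit, and force it if it does not comply.

    Hypothetical helper, not part of Mesos or Docker. Sends SIGTERM, waits
    up to grace_seconds (an assumed default), then falls back to SIGKILL.
    Returns the name of the last signal that was sent.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        # The process ignored or could not act on SIGTERM; SIGKILL cannot
        # be caught or ignored on POSIX systems.
        proc.kill()
        proc.wait()
        return "SIGKILL"


if __name__ == "__main__":
    # Demo with a stand-in child that simply sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print(terminate_with_escalation(child, grace_seconds=2.0))
```

The point of the bounded wait is that the caller never blocks longer than the grace period, which is
the property the retry-{{docker stop}} approach lacks when the CLI itself hangs.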

> Make Docker executor/containerizer resilient to Docker daemon failures.
> -----------------------------------------------------------------------
>
>                 Key: MESOS-8572
>                 URL: https://issues.apache.org/jira/browse/MESOS-8572
>             Project: Mesos
>          Issue Type: Epic
>          Components: containerization, docker, executor
>    Affects Versions: 1.5.0
>            Reporter: Greg Mann
>            Assignee: Greg Mann
>            Priority: Major
>              Labels: mesosphere
>
> Experience has shown that the Docker CLI can hang indefinitely at times. There are many
> variations of this behavior, and it occurs across many versions of Docker. For these reasons,
> and since many users of Mesos still make heavy use of the Docker containerizer and the Docker
> executor, it will improve the user experience to make the Docker containerizer/executor resilient
> to such Docker daemon failures.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
