mesos-issues mailing list archives

From "Vishant Singh (JIRA)" <>
Subject [jira] [Commented] (MESOS-8574) Docker executor makes no progress when 'docker inspect' hangs
Date Thu, 12 Apr 2018 14:44:00 GMT


Vishant Singh commented on MESOS-8574:



After going through all the comments, I am a bit confused.

Are we adding a timeout for docker inspect/stop?

Or are we depending on task termination from the scheduler after "task_launch_timeout"?

> Docker executor makes no progress when 'docker inspect' hangs
> -------------------------------------------------------------
>                 Key: MESOS-8574
>                 URL:
>             Project: Mesos
>          Issue Type: Improvement
>          Components: docker, executor
>    Affects Versions: 1.5.0
>            Reporter: Greg Mann
>            Assignee: Andrei Budnik
>            Priority: Major
>              Labels: mesosphere
>             Fix For: 1.3.3, 1.4.2, 1.5.1, 1.6.0
> In the Docker executor, many calls later in the executor's lifecycle are gated on an
> initial {{docker inspect}} call returning:
> If that first call to {{docker inspect}} never returns, the executor becomes stuck in
> a state where it makes no progress and cannot be killed.
> It's tempting for the executor to simply commit suicide after a timeout, but we must
> be careful of the case in which the executor's Docker container is actually running successfully,
> but the Docker daemon is unresponsive. In such a case, we do not want to send TASK_FAILED
> or TASK_KILLED if the task's container is running successfully.
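
To make the concern in the description concrete, here is a rough Python sketch of bounding a {{docker inspect}} call with a timeout while keeping "timed out" separate from "failed". It is not the Mesos executor code and not the fix that was committed for this issue. The names InspectResult, run_with_timeout and inspect_container are hypothetical, and the 5 second default is an arbitrary assumption. The point is only that a timeout yields an "unknown" result, which by itself should not be turned into TASK_FAILED or TASK_KILLED.

```python
import subprocess
import sys
from enum import Enum


class InspectResult(Enum):
    """Hypothetical tri-state outcome; not a Mesos type."""
    OK = "ok"            # command exited with status 0
    ERROR = "error"      # command exited non-zero or could not be started
    UNKNOWN = "unknown"  # command did not return in time; container state unknown


def run_with_timeout(argv, timeout_secs):
    """Run argv, giving up after timeout_secs instead of blocking forever."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_secs,
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run. We learned nothing about
        # the container, so report UNKNOWN rather than a failure.
        return InspectResult.UNKNOWN
    except OSError:
        # For example, the binary is not installed.
        return InspectResult.ERROR
    return InspectResult.OK if proc.returncode == 0 else InspectResult.ERROR


def inspect_container(name, timeout_secs=5.0):
    """Assumed wrapper around the real CLI call `docker inspect <name>`."""
    return run_with_timeout(["docker", "inspect", name], timeout_secs)


if __name__ == "__main__":
    # Stand-in for a hung `docker inspect`: a child process that sleeps
    # longer than the timeout allows.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hung, 0.5))
```

A caller would treat UNKNOWN as "retry or escalate", not as evidence that the task died, since the container may still be running while the Docker daemon is unresponsive.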

This message was sent by Atlassian JIRA
