mesos-issues mailing list archives

From "Jie Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-2583) Tasks getting stuck in staging
Date Tue, 31 Mar 2015 23:43:53 GMT

    [ https://issues.apache.org/jira/browse/MESOS-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389680#comment-14389680 ]

Jie Yu commented on MESOS-2583:
-------------------------------

Vinod and I triaged it. From the log you pasted, we don't understand why 'docker stop' does
not trigger executorTerminated. Without that, the slave won't send TASK_LOST. cc [~tnachen]

> Tasks getting stuck in staging
> ------------------------------
>
>                 Key: MESOS-2583
>                 URL: https://issues.apache.org/jira/browse/MESOS-2583
>             Project: Mesos
>          Issue Type: Bug
>          Components: slave
>    Affects Versions: 0.22.0
>            Reporter: Brenden Matthews
>         Attachments: Justin-Bieber_The-Beliebers-Want-to-Believe-2-650x406.jpg, Screen Shot 2015-03-26 at 11.59.33 AM.png, Screen Shot 2015-03-30 at 2.04.14 PM.png, log.txt
>
>
> Tasks occasionally become stuck in the `TASK_STAGING` state after launching. It appears
that this affects both Docker and non-Docker tasks, especially those which start up and fail
immediately. Attached is a sample of the slave log as well as screenshots from a testing cluster
showing the tasks which are stuck in staging, and then a number of failed tasks which occur
after restarting the slave process. Justin Bieber is provided for scale.
> This may be related to MESOS-1837, and quite possibly the same issue, but it remains
unclear.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
