mesos-issues mailing list archives

From "Anand Mazumdar (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (MESOS-3766) Can not kill task in Status STAGING
Date Fri, 23 Oct 2015 20:32:27 GMT

     [ https://issues.apache.org/jira/browse/MESOS-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anand Mazumdar reassigned MESOS-3766:
-------------------------------------

    Assignee: Anand Mazumdar  (was: Niklas Quarfot Nielsen)

> Can not kill task in Status STAGING
> -----------------------------------
>
>                 Key: MESOS-3766
>                 URL: https://issues.apache.org/jira/browse/MESOS-3766
>             Project: Mesos
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 0.25.0
>         Environment: OSX 
>            Reporter: Matthias Veit
>            Assignee: Anand Mazumdar
>         Attachments: master.log.zip, slave.log.zip
>
>
> I created a simple Marathon application with an instance count of 100 (100 tasks), each
> running a simple sleep command. Before all tasks were running, I killed all of them. The
> kill succeeded for all but 2 tasks, which remain in state STAGING (according to the Mesos
> UI). Marathon has been retrying the kill for those tasks every 5 seconds for over an hour,
> without success.
> I picked one task and grepped the slave log:
> {noformat}
> I1020 12:39:38.480478 315482112 slave.cpp:1270] Got assigned task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
for framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:39:38.887559 315482112 slave.cpp:1386] Launching task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
for framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:39:38.898221 315482112 slave.cpp:4852] Launching executor app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000 with resour
> I1020 12:39:38.899521 315482112 slave.cpp:1604] Queuing task 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d'
for executor app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework '80
> I1020 12:39:39.740401 313872384 containerizer.cpp:640] Starting container '5ce75a17-12db-4c8f-9131-b40f8280b9f7'
for executor 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d' of fr
> I1020 12:39:40.495931 313872384 containerizer.cpp:873] Checkpointing executor's forked
pid 37096 to '/tmp/mesos/meta/slaves/80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0/frameworks
> I1020 12:39:41.744439 313335808 slave.cpp:2379] Got registration for executor 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d'
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-000
> I1020 12:39:42.080734 313335808 slave.cpp:1760] Sending queued task 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d'
to executor 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d' of frame
> I1020 12:40:13.073390 312262656 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:40:18.079651 312262656 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:40:23.097504 313335808 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:40:28.118443 313872384 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:40:33.138137 313335808 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:40:38.158529 316018688 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:40:43.177901 314408960 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:40:48.197852 313872384 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:40:53.216672 316018688 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:40:58.238471 314945536 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:41:03.256614 312799232 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:41:08.276450 313335808 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:41:13.297114 315482112 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:41:18.316463 316018688 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> I1020 12:41:23.337116 313872384 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> .
> .
> .
> I1020 14:11:03.614157 316018688 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
> {noformat}
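
The five-second kill cadence reported above can be checked directly from the slave log. The following is a minimal sketch, not part of the original report: it assumes glog-style prefixes as shown in the excerpt, assumes the year 2015 (glog lines carry no year), and uses a hypothetical helper name `kill_request_gaps`. The sample lines are the first three kill requests from the excerpt, with each wrapped line joined back into one.

```python
import re
from datetime import datetime

# glog prefix: severity letter, MMDD, HH:MM:SS.micros, thread id, "file:line]"
GLOG = re.compile(r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6}) \d+ \S+\] (.*)$")

TASK_ID = "app.dc98434b-7716-11e5-a5fc-1ea69edef42d"
FRAMEWORK = "80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000"

# First three kill requests from the slave log excerpt (wrapped lines joined).
SAMPLE = [
    f"> I1020 12:40:13.073390 312262656 slave.cpp:1789] Asked to kill task {TASK_ID} of framework {FRAMEWORK}",
    f"> I1020 12:40:18.079651 312262656 slave.cpp:1789] Asked to kill task {TASK_ID} of framework {FRAMEWORK}",
    f"> I1020 12:40:23.097504 313335808 slave.cpp:1789] Asked to kill task {TASK_ID} of framework {FRAMEWORK}",
]


def kill_request_gaps(lines, task_id, year=2015):
    """Return the seconds between consecutive kill requests for task_id.

    The year is an assumption; glog timestamps omit it.
    """
    stamps = []
    for line in lines:
        # Drop the mail quote marker ("> ") before matching the glog prefix.
        match = GLOG.match(line.lstrip("> ").strip())
        if not match:
            continue
        mmdd, clock, message = match.groups()
        if message.startswith(f"Asked to kill task {task_id}"):
            stamps.append(
                datetime.strptime(f"{year}{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
            )
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


gaps = kill_request_gaps(SAMPLE, TASK_ID)
print(gaps)
```

Run against the full attached slave.log.zip, the same function would show whether the interval stays near five seconds for the whole period.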
> The master log looks like this:
> {noformat}
> I1020 12:39:38.044208 351387648 master.hpp:176] Adding task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
with resources cpus(*):0.1; mem(*):16; ports(*):[31232-31232] on slave 80
> I1020 12:39:38.044494 351387648 master.cpp:3248] Launching task app.dc98434b-7716-11e5-a5fc-1ea69edef42d
of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000 (marathon) at 
> I1020 12:40:13.061883 350314496 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0
at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
> I1020 12:40:18.079074 351387648 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0
at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
> I1020 12:40:23.097110 352460800 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0
at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
> I1020 12:40:28.117952 352997376 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0
at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
> I1020 12:40:33.137667 352460800 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0
at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
> I1020 12:40:38.157832 354070528 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0
at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
> I1020 12:40:43.177223 353533952 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0
at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
> .
> .
> .
> I1020 14:11:33.611827 353533952 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0
at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
> {noformat}
> In the sandbox, stdout is empty and stderr has the following content:
> {noformat}
> I1020 12:39:41.551882 2047558400 exec.cpp:134] Version: 0.25.0
> {noformat}
> Just for reference, this was the Marathon Application used:
> {noformat}
> {
>   "id": "/app", 
>   "mem": 16.0, 
>   "cmd": "sleep 10000", 
>   "cpus": 0.1, 
>   "disk": 0.0, 
>   "env": {
>       "foo": "bla"
>   } 
> }
> {noformat}
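
For anyone trying to reproduce the report, the steps described above (create the app with 100 instances, then kill all tasks before they are all running) might be scripted roughly as follows. This is a sketch under stated assumptions, not the reporter's procedure: the Marathon address `http://localhost:8080` is assumed, the `instances` field is added from the description since the quoted JSON omits it, and the helper names are hypothetical. It only builds the HTTP requests for Marathon's `POST /v2/apps` and `DELETE /v2/apps/{appId}/tasks` endpoints; sending them requires a running Marathon, and because the issue depends on timing, there is no guarantee it triggers on every run.

```python
import json
import urllib.request

# Assumption: Marathon listening locally on its default port.
MARATHON = "http://localhost:8080"


def app_definition(instances=100):
    """App from the report, plus the instance count given in the description."""
    return {
        "id": "/app",
        "mem": 16.0,
        "cmd": "sleep 10000",
        "cpus": 0.1,
        "disk": 0.0,
        "env": {"foo": "bla"},
        "instances": instances,
    }


def create_request(base_url, app):
    """Build (but do not send) the request that creates the app."""
    return urllib.request.Request(
        f"{base_url}/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def kill_all_request(base_url, app_id):
    """Build (but do not send) the request that kills all tasks of the app."""
    return urllib.request.Request(
        f"{base_url}/v2/apps/{app_id.strip('/')}/tasks",
        method="DELETE",
    )


create = create_request(MARATHON, app_definition())
kill = kill_all_request(MARATHON, "/app")
print(create.get_method(), create.full_url)
print(kill.get_method(), kill.full_url)

# To actually send them against a live Marathon:
#   urllib.request.urlopen(create)
#   urllib.request.urlopen(kill)   # issue this before all 100 tasks are running
```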



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
