mesos-issues mailing list archives

From "Timothy Chen (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (MESOS-1824) when "docker ps -a" returns 400+ lines enabling docker containerizer results in all executors dying
Date Mon, 20 Oct 2014 16:28:36 GMT

     [ https://issues.apache.org/jira/browse/MESOS-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Chen resolved MESOS-1824.
---------------------------------
    Resolution: Fixed

> when "docker ps -a" returns 400+ lines enabling docker containerizer results in all executors dying
> ---------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-1824
>                 URL: https://issues.apache.org/jira/browse/MESOS-1824
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>            Reporter: Jay Buffington
>            Assignee: Timothy Chen
>
> To reproduce:
> # run this one-liner on your slave to create 400 exited docker containers:
> {noformat}
> for i in `seq 1 400`; do docker run busybox:latest echo "hello" ; done;
> {noformat}
> # Start mesos-slave with only mesos containerizer enabled
> # Launch tasks that use an executor (which uses libmesos)
> # Restart mesos-slave process with --containerizer=docker,mesos
> # See mesos-slave fork "docker ps -a" and never return
> # Note that this mesos-slave never reregisters with master
> # Wait at least 10 minutes and see executors commit suicide, which kills all of the tasks on your system. From executor log:
> {noformat}
> I0919 21:24:14.018127 21778 exec.cpp:379] Executor asked to shutdown
> I0919 21:24:14.018812 21771 exec.cpp:78] Scheduling shutdown of the executor
> I0919 21:24:14.020514 21778 exec.cpp:394] Executor::shutdown took 1.866382ms
> I0919 21:24:16.000500 21771 exec.cpp:525] Executor sending status update TASK_KILLED (UUID: bfd3969c-ad0a-455a-93fe-06c37bdee513) for task 1411160025479-another-task-0-b5e24381-3353-43d4-9587-ffef9ccf2f38 of framework 20140814-221057-1208029356-5050-10525-0000
> I0919 21:24:16.030253 21772 exec.cpp:332] Ignoring status update acknowledgement bfd3969c-ad0a-455a-93fe-06c37bdee513 for task 1411160025479-another-task-0-b5e24381-3353-43d4-9587-ffef9ccf2f38 of framework 20140814-221057-1208029356-5050-10525-0000 because the driver is aborted!
> I0919 21:24:19.021966 21778 exec.cpp:86] Committing suicide by killing the process group
> {noformat}
> # mesos-slave fails to tell the master about the tasks being killed, with this message in the log:
> {noformat}
> W0918 01:02:57.252231 11725 status_update_manager.cpp:381] Not
> forwarding status update TASK_KILLED (UUID:
> 6fbacbcf-ad0f-4e89-89ee-e9f88a618573) for task
> 1410298578043-some-task-30-29279377-fdf2-4bb7-b862-852adddea09c
> of framework 20140522-213145-1749004561-5050-29512-0000 because no
> master is elected yet
> {noformat}
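
Before restarting the slave with the docker containerizer enabled, it can help to check whether a host is already past the size described in the report. The following is a minimal sketch, assuming a POSIX shell and the docker CLI on the PATH. The helper names count_docker_ps_lines and docker_ps_exceeds are hypothetical and are not part of Mesos or Docker. Note that "docker ps -a" prints a header line, so 400 containers produce 401 lines.

```shell
# Hypothetical helper: print how many lines "docker ps -a" emits,
# including its header line.
count_docker_ps_lines() {
    docker ps -a | wc -l | tr -d ' '
}

# Hypothetical helper: succeed (exit 0) when the line count is at or
# above the given threshold. The default of 400 mirrors the issue title.
docker_ps_exceeds() {
    threshold="${1:-400}"
    [ "$(count_docker_ps_lines)" -ge "$threshold" ]
}

# Example use (not executed here):
#   if docker_ps_exceeds 400; then
#       echo "docker ps -a output is large; MESOS-1824 may apply"
#   fi
```

The check only reports the size of the listing; it does not change anything on the host.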



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
