mesos-user mailing list archives

From 琪 冯 <>
Subject Re: About executor failover
Date Wed, 23 Mar 2016 13:07:18 GMT
I should say that I finally found a way to clean up orphan containers.
I learnt that the executor does not remove its container when the task completes. The executor
stops the container and exits. The container then stays on the slave machine in the exited state
until --docker_remove_delay has elapsed.
I set --docker_remove_delay="1mins", restarted the slave, and killed an executor process.
After 1 minute, the container left behind by the killed executor was removed.
This may not be a good way to solve my problem, but it works.
Thank you, haosdent. Thank you for your help. 😊

From: haosdent <>
Sent: Wednesday, March 23, 2016 11:17 AM
To: user
Subject: Re: About executor failover

But I think we could make sure the Docker container exits when the executor is killed. If you have clear
requirements, could you file them in the Mesos JIRA so that other folks
can help check whether the request should be accepted or not?

Mesos - ASF JIRA -<>

On Wed, Mar 23, 2016 at 7:14 PM, haosdent <<>>
As far as I know, a framework currently has no way to learn about orphan containers.

On Wed, Mar 23, 2016 at 6:50 PM, 琪 冯 <<>>

Many thanks for the reply!
I learnt that the orphan containers are removed by slave recovery. What I mean is: is there anything
I can do from the framework, or with some other monitor, to detect or remove them automatically?

Thanks for your help.
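
One way an external monitor on the slave host might detect such containers is sketched below. This is only an illustration, not a Mesos feature: it assumes that every container launched by the Docker executor has a name starting with "mesos-" and that a live executor carries that name in its --container= argument, as in the process listing quoted further down. The function names are made up for this sketch.

```python
import re
import subprocess

# Matches the --container=<name> argument seen in the executor's process listing.
CONTAINER_ARG = re.compile(r"--container=(\S+)")


def find_orphans(container_names, cmdlines):
    """Return mesos-* containers that no mesos-docker-executor process claims."""
    claimed = set()
    for cmdline in cmdlines:
        if "mesos-docker-executor" not in cmdline:
            continue
        match = CONTAINER_ARG.search(cmdline)
        if match:
            claimed.add(match.group(1))
    return sorted(
        name for name in container_names
        if name.startswith("mesos-") and name not in claimed
    )


def list_running_containers():
    """Names of running containers, one per line of `docker ps` output."""
    out = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def list_process_cmdlines():
    """Full command line of every process on the host."""
    out = subprocess.run(
        ["ps", "-eo", "args"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()
```

Run on the slave host, find_orphans(list_running_containers(), list_process_cmdlines()) would list the suspects. A container whose executor is still starting up could show up briefly, so a real monitor would probably want to see the same name in two consecutive checks before acting on it.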

From: haosdent <<>>
Sent: Wednesday, March 23, 2016 3:22 AM
To: user
Subject: Re: About executor failover

Yes, in that case, these orphan containers would be recovered or killed when you restart

On Wed, Mar 23, 2016 at 11:13 AM, 琪 冯 <<>>

What if the executor process goes down while its Docker container is still alive?

As a test, I killed an executor process on one of my Mesos slave machines. The process details
looked like this:

root     17166  9569  0 Mar22 ?        00:01:39 mesos-docker-executor --container=mesos-0d58cb85-e726-479a-a57a-83405e3ae580-S3.b995031b-9c46-4713-9050-518aa306c6aa
--docker=docker --docker_socket=/var/run/docker.sock --help=false --mapped_directory=/mnt/mesos/sandbox

Then I checked, and the container named "mesos-0d58cb85-e726-479a-a57a-83405e3ae580-S3.b995031b-9c46-4713-9050-518aa306c6aa"
was still alive.
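
To tie a process listing like the one above back to a container, the name can be pulled out of the --container= argument. The sketch below also splits it into a slave ID and a container ID, on the assumption that the name is laid out as "mesos-" + slave ID + "." + container ID. That layout is inferred from the example above and is not stated in this thread; the function name is hypothetical.

```python
import re

# The executor receives the Docker container name as --container=<name>.
CONTAINER_ARG = re.compile(r"--container=(\S+)")
NAME_PREFIX = "mesos-"


def parse_executor_cmdline(cmdline):
    """Extract the container name and its assumed parts from a command line.

    Returns None when the command line carries no mesos-* container name.
    """
    match = CONTAINER_ARG.search(cmdline)
    if match is None:
        return None
    name = match.group(1)
    if not name.startswith(NAME_PREFIX) or "." not in name:
        return None
    # Assumed layout: everything before the first dot is the slave ID.
    slave_id, container_id = name[len(NAME_PREFIX):].split(".", 1)
    return {"name": name, "slave_id": slave_id, "container_id": container_id}
```

For the listing above this yields the slave ID "0d58cb85-e726-479a-a57a-83405e3ae580-S3" and the container ID "b995031b-9c46-4713-9050-518aa306c6aa".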

My Mesos version is 0.25.0, and the kernel on the Mesos slave machine is Linux 3.10.0-229.11.1.el7.x86_64.

I mean that if the executor process crashes or is killed for whatever reason (while the container stays alive),
a new container is launched in response to the task_lost event. So the container created by the dead executor
process becomes undiscoverable to my framework.

I want to know whether I am wrong, or whether there is a way to handle this scenario.

I hope my question is clear; if not, please let me know.

Any feedback would be appreciated. 😊

Best Regards,
Haosdent Huang
