mesos-user mailing list archives

From haosdent <haosd...@gmail.com>
Subject Re: About executor failover
Date Wed, 23 Mar 2016 11:14:36 GMT
As far as I know, a framework currently has no way to discover orphan containers.
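
An external monitor running on each slave machine is one possible workaround. The sketch below is not a Mesos feature and is only an illustration under stated assumptions: Docker containers launched by Mesos have names starting with "mesos-", and each live mesos-docker-executor process carries a --container=<name> flag, as in the ps output quoted later in this thread. The function names are hypothetical.

```python
import re
import subprocess

# Assumption: the executor command line carries --container=<docker name>,
# as seen in the ps output quoted in this thread.
CONTAINER_FLAG = re.compile(r"--container=(\S+)")


def find_orphans(container_names, process_cmdlines, prefix="mesos-"):
    """Return Mesos-named containers that no live executor process claims."""
    claimed = set()
    for cmd in process_cmdlines:
        if "mesos-docker-executor" not in cmd:
            continue
        match = CONTAINER_FLAG.search(cmd)
        if match:
            claimed.add(match.group(1))
    return sorted(
        name for name in container_names
        if name.startswith(prefix) and name not in claimed
    )


def list_docker_names():
    """Names of running containers, via `docker ps --format`."""
    out = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


def list_process_cmdlines():
    """Full command lines of all processes, via `ps -e -o args`."""
    out = subprocess.run(
        ["ps", "-e", "-o", "args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()
```

On a slave machine, find_orphans(list_docker_names(), list_process_cmdlines()) would give the candidate orphans. The sketch only reports them; whether to stop them, or to leave that to slave recovery, is a separate decision.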

On Wed, Mar 23, 2016 at 6:50 PM, 琪 冯 <Athlum5211@outlook.com> wrote:

>
> Many thanks for the reply!
> I learned that orphan containers are removed by slave recovery. What I
> mean is: is there anything I can do from the framework, or with some other
> monitor, to detect or remove them automatically?
>
> Thanks for your help.
>
>
> ------------------------------
> *From:* haosdent <haosdent@gmail.com>
> *Sent:* Wednesday, March 23, 2016 3:22 AM
> *To:* user
> *Subject:* Re: About executor failover
>
> Yes, in that case these orphan containers would be recovered or killed
> when you restart the slave.
>
> On Wed, Mar 23, 2016 at 11:13 AM, 琪 冯 <Athlum5211@outlook.com> wrote:
>
>> What if the executor process goes down while its docker container is
>> still alive?
>>
>> As a test, I killed an executor process on one of my mesos slave
>> machines. The process details looked like this:
>>
>>
>> root     17166  9569  0 Mar22 ?        00:01:39 mesos-docker-executor
>> --container=mesos-0d58cb85-e726-479a-a57a-83405e3ae580-S3.b995031b-9c46-4713-9050-518aa306c6aa
>> --docker=docker --docker_socket=/var/run/docker.sock --help=false
>> --mapped_directory=/mnt/mesos/sandbox --sandbox_directory=/data/mesos/slaves/0d58cb85-e726-479a-a57a-83405e3ae580-S3/frameworks/5cfc9845-05c0-45b1-acc0-595ab92075d2-0000/executors/archtools_hearthstone.eless_eless.uwsgi.353f920b-eff6-11e5-97d3-aeb4726ea116/runs/b995031b-9c46-4713-9050-518aa306c6aa
>> --stop_timeout=0ns
>>
>>
>> Then I checked, and the container named "mesos-0d58cb85-e726-479a-a57a-83405e3ae580-S3.b995031b-9c46-4713-9050-518aa306c6aa"
>> was still alive.
>>
>>
>> My mesos version is 0.25.0. And the mesos slave machine kernel version is
>> Linux 3.10.0-229.11.1.el7.x86_64.
>>
>>
>> What I mean is: if the executor process crashes or is killed for whatever
>> reason (but the container stays alive), a new container is launched in
>> response to the task_lost event. The container created by the dead
>> executor process would then be undiscoverable to my framework.
>>
>>
>> I want to know if I am wrong, or whether there is a way to handle this scenario.
>>
>>
>> I hope my question is clear. If not, please let me know.
>>
>>
>> Any feedback would be appreciated.
>>
>>
>
>
> --
> Best Regards,
> Haosdent Huang
>



-- 
Best Regards,
Haosdent Huang
