mesos-user mailing list archives

From June Taylor <j...@umn.edu>
Subject Re: orphaned_tasks cleanup and prevention method
Date Thu, 07 Apr 2016 15:06:17 GMT
Here is one of three orphaned tasks (first two octets of IP removed):

"orphan_tasks": [
        {
            "executor_id": "",
            "name": "Task 1",
            "framework_id": "14cddded-e692-4838-9893-6e04a81481d8-0006",
            "state": "TASK_RUNNING",
            "statuses": [
                {
                    "timestamp": 1459887295.05554,
                    "state": "TASK_RUNNING",
                    "container_status": {
                        "network_infos": [
                            {
                                "ip_addresses": [
                                    {
                                        "ip_address": "xxx.xxx.163.205"
                                    }
                                ],
                                "ip_address": "xxx.xxx.163.205"
                            }
                        ]
                    }
                }
            ],
            "slave_id": "182cf09f-0843-4736-82f1-d913089d7df4-S83",
            "id": "1",
            "resources": {
                "mem": 112640.0,
                "disk": 0.0,
                "cpus": 30.0
            }
        }
]
Going to this slave, I can find an executor within the Mesos working
directory which matches this framework ID. The stdout log inside indicates
the program has finished its work, but the task is still holding these
resources open.

This framework ID is not shown as Active in the main Mesos web UI, but it
does show up in the slave's web UI.

The resources consumed count towards the Idle pool, leaving zero resources
available for other offers.
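For reference, the orphans above can be tallied directly from the master's state output (the same JSON that `mesos state` returns). A minimal Python sketch, where `summarize_orphans` is a made-up helper name and the state is assumed to already be parsed into a dict:

```python
from collections import Counter

def summarize_orphans(state):
    """Given the master's state JSON (as a dict), return the orphaned
    tasks and the total resources they are still holding."""
    orphans = state.get("orphan_tasks", [])
    totals = Counter()
    for task in orphans:
        # Sum each scalar resource (cpus, mem, disk) across all orphans.
        for resource, amount in task.get("resources", {}).items():
            totals[resource] += amount
    return orphans, dict(totals)

# Example using the fragment quoted above (most fields omitted):
state = {
    "orphan_tasks": [
        {
            "framework_id": "14cddded-e692-4838-9893-6e04a81481d8-0006",
            "slave_id": "182cf09f-0843-4736-82f1-d913089d7df4-S83",
            "id": "1",
            "resources": {"mem": 112640.0, "disk": 0.0, "cpus": 30.0},
        }
    ]
}

orphans, held = summarize_orphans(state)
print(len(orphans), held)  # 1 {'mem': 112640.0, 'disk': 0.0, 'cpus': 30.0}
```

This only reports the orphans and what they hold; it does not kill them, since I'm not aware of a master endpoint for that.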



Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Thu, Apr 7, 2016 at 9:46 AM, haosdent <haosdent@gmail.com> wrote:

> > pyspark executors hanging around and consuming resources marked as Idle
> in mesos Web UI
>
> Do you have some logs about this?
>
> >is there an API call I can make to kill these orphans?
>
> As far as I know, the mesos agent tries to clean up orphan containers
> when it restarts. But I'm not sure the orphans I mean here are the same
> as yours.
>
> On Thu, Apr 7, 2016 at 10:21 PM, June Taylor <june@umn.edu> wrote:
>
>> Greetings mesos users!
>>
>> I am debugging an issue with pyspark executors hanging around and
>> consuming resources marked as Idle in mesos Web UI. These tasks also show
>> up in the orphaned_tasks key in `mesos state`.
>>
>> I'm first wondering how to clear them out - is there an API call I can
>> make to kill these orphans? Secondly, how it happened at all.
>>
>> Thanks,
>> June Taylor
>> System Administrator, Minnesota Population Center
>> University of Minnesota
>>
>
>
>
> --
> Best Regards,
> Haosdent Huang
>
