mesos-issues mailing list archives

From "Avinash Sridharan (JIRA)" <>
Subject [jira] [Commented] (MESOS-5879) cgroups/net_cls isolator causing agent recovery issues
Date Sat, 08 Oct 2016 17:30:20 GMT


Avinash Sridharan commented on MESOS-5879:

I think we should keep this open. Even after we fix MESOS-6035, the Net_Cls subsystem
will still need to be changed to use the non-recursive cgroups::get to fix this, right?
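
To illustrate the recursive versus non-recursive distinction, here is a minimal sketch. It is not the Mesos API: `list_cgroups` and `demo` are hypothetical names, and a temporary directory tree stands in for a real cgroup hierarchy. The assumption is that a recursive listing also returns cgroups nested under a container's cgroup (for example, ones created by another isolator), while a non-recursive listing returns only the top-level container cgroups.

```python
import os
import tempfile


def list_cgroups(root, recursive):
    """Return cgroup directories under `root`, as paths relative to `root`.

    Hypothetical stand-in for a cgroup listing helper; not the Mesos API.
    """
    if not recursive:
        # Non-recursive: only the immediate children of the root.
        return sorted(
            entry.name for entry in os.scandir(root) if entry.is_dir()
        )

    # Recursive: every directory at any depth below the root.
    found = []
    for dirpath, dirnames, _ in os.walk(root):
        for name in dirnames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)


def demo():
    """Build a throwaway tree that mimics a cgroup hierarchy and list it."""
    with tempfile.TemporaryDirectory() as root:
        # Two top-level container cgroups; one has a nested child, as a
        # custom isolator might create (an assumption for illustration).
        os.makedirs(os.path.join(root, "container-a", "nested"))
        os.makedirs(os.path.join(root, "container-b"))
        recursive = list_cgroups(root, recursive=True)
        flat = list_cgroups(root, recursive=False)
    return recursive, flat
```

In this sketch, only the recursive listing includes the nested entry, so a recovery pass that iterates over the non-recursive listing would never see cgroups that some other component created below a container.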

> cgroups/net_cls isolator causing agent recovery issues
> ------------------------------------------------------
>                 Key: MESOS-5879
>                 URL:
>             Project: Mesos
>          Issue Type: Bug
>          Components: cgroups, isolation, slave
>            Reporter: Silas Snider
>            Assignee: Avinash Sridharan
>              Labels: mesosphere
> We run with 'cgroups/net_cls' in our isolator list, and when we restart any agent process
> in a cluster running an experimental custom isolator as well, the agents are unable to recover
> from checkpoint, because net_cls reports that unknown orphan containers have duplicate net_cls
> While this is a problem that needs to be solved (probably by fixing our custom isolator),
> it's also a problem that the net_cls isolator fails recovery just for duplicate handles in
> cgroups that it is literally about to unconditionally destroy during recovery. Can this be

This message was sent by Atlassian JIRA
