mesos-issues mailing list archives

From "Qian Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-6149) Checkpoint used subsystems for containers
Date Tue, 13 Sep 2016 01:20:21 GMT

    [ https://issues.apache.org/jira/browse/MESOS-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15485880#comment-15485880 ]

Qian Zhang commented on MESOS-6149:
-----------------------------------

I think MESOS-6063 already handles the case where the agent is restarted with more cgroups
subsystems enabled, and this ticket is meant to handle the case where the agent is restarted
with fewer cgroups subsystems enabled. For example, before the restart the enabled subsystems
are {{cgroups/cpu,cgroups/mem,cgroups/net_cls}}, and after the restart they are
{{cgroups/cpu,cgroups/mem}}, i.e., {{cgroups/net_cls}} is disabled after the agent is restarted.
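
To make the example above concrete, here is a minimal sketch in Python (not Mesos code) that compares the subsystem lists before and after a restart. The function name {{diff_subsystems}} is hypothetical and only serves the illustration; the subsystem names are the ones from the example.

```python
def diff_subsystems(before, after):
    """Return (added, removed) subsystem sets across an agent restart.

    Hypothetical helper for illustration only; it is not part of Mesos.
    """
    before, after = set(before), set(after)
    return after - before, before - after


# Enabled subsystems from the example in this comment.
before_restart = ["cgroups/cpu", "cgroups/mem", "cgroups/net_cls"]
after_restart = ["cgroups/cpu", "cgroups/mem"]

added, removed = diff_subsystems(before_restart, after_restart)
# `added` is the MESOS-6063 case (more subsystems enabled after restart);
# `removed` is the case discussed in this ticket (fewer subsystems enabled).
```

In this example `added` is empty and `removed` contains only `cgroups/net_cls`, which is the subsystem the restarted agent would no longer call.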

However, I am not sure that checkpointing the used subsystems for each container helps with
this case. Once a subsystem is disabled after the agent restarts, even if we have checkpointed
the container's used subsystems, we still have no chance to do any cleanup for the disabled
subsystem when the container terminates (because the agent will not call that subsystem at
all), so the cgroups created for the container will remain there as garbage.

One possible solution I have in mind: for a container created before the agent restarts,
we keep using its checkpointed subsystems (even if some of them are disabled after the
restart), but for new containers created after the agent restarts, we use only the
subsystems enabled in the agent.
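
The idea above could be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Mesos implementation: the class {{SubsystemTracker}}, its methods, and the shape of the checkpoint (a mapping from container ID to subsystem names) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Mapping


@dataclass
class SubsystemTracker:
    """Hypothetical tracker: which subsystems each container uses."""

    # Subsystems enabled in the agent after the restart.
    enabled: FrozenSet[str]
    # Container ID -> subsystems that container was set up with.
    per_container: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def recover(self, checkpoint: Mapping[str, Iterable[str]]) -> None:
        # Containers created before the restart keep their checkpointed
        # subsystems, even if some are no longer enabled in the agent.
        for container_id, subsystems in checkpoint.items():
            self.per_container[container_id] = frozenset(subsystems)

    def launch(self, container_id: str) -> FrozenSet[str]:
        # New containers only get the subsystems enabled in the agent.
        self.per_container[container_id] = self.enabled
        return self.enabled

    def cleanup(self, container_id: str) -> FrozenSet[str]:
        # Subsystems whose cgroups must be cleaned up for this container.
        return self.per_container.pop(container_id, frozenset())


tracker = SubsystemTracker(enabled=frozenset({"cgroups/cpu", "cgroups/mem"}))
tracker.recover({"old": ["cgroups/cpu", "cgroups/mem", "cgroups/net_cls"]})
tracker.launch("new")

old_cleanup = tracker.cleanup("old")  # still includes cgroups/net_cls
new_cleanup = tracker.cleanup("new")  # only the currently enabled subsystems
```

The point of the sketch is that cleanup is driven by what each container was actually set up with, not by the agent's current flags, so the {{cgroups/net_cls}} cgroup of the old container is not left behind.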

> Checkpoint used subsystems for containers
> -----------------------------------------
>
>                 Key: MESOS-6149
>                 URL: https://issues.apache.org/jira/browse/MESOS-6149
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: haosdent
>            Assignee: haosdent
>
> In MESOS-6063, we have tracked recovered and prepared subsystems for containers. To make
> this work better, we could checkpoint this information and recover it after the agent restarts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
