mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From haosdent huang <haosd...@gmail.com>
Subject Re: Review Request 49827: Implemented `CgroupsIsolatorProcess::cleanup`.
Date Sun, 21 Aug 2016 05:52:05 GMT


> On July 28, 2016, 12:01 a.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp, line 486
> > <https://reviews.apache.org/r/49827/diff/9/?file=1453251#file1453251line486>
> >
> >     Would love to understand why test will cause this issue.
> 
> Qian Zhang wrote:
>     It seems that comment was introduced by this ticket: https://issues.apache.org/jira/browse/MESOS-1438
, and in the test code here: https://github.com/apache/mesos/blob/1.0.0/src/tests/cluster.cpp#L552:L559
, I see each container created in the test will be cleaned up, and we have already initiated
cleanup for those containers in the test code, that might be where the race occurs.
> 
> haosdent huang wrote:
>     Thx @qian!
> 
> Jie Yu wrote:
>     Calling `destroy` on Containerizer multiple times should not cause `isolator->destroy`
being called multiple times. I still don't get it

Hi, @jieyu. I check the code again. We have 

```
  if (container->state == DESTROYING) {
    VLOG(1) << "Destroy has already been initiated for '" << containerId <<
"'";
    return;
  }
```

so multiple `MesosContainerizer::destroy` should be safe.

But we still need this for these 2 cases:

1. `MesosContainerizer::prepare` are running serially. Suppose we failed in other isolator
preparation and have not performed `CgroupsIsolator::prepare`. `MesosContainerizer::destroy`
would be run in Mesos Agent. And then it would run `CgroupsIsolator::destroy` as well.
2. Suppose we start Mesos Agent with `--isolation=cgroups/cpu,cgroups/mem` and start some
containers. And restart Mesos Agent with a new isolation flag `--isolation=cgroups/cpu,cgroups/mem,cgroups/perf_event`.
Then when recover those exists containers, only `cpu`, `cpuacct`, `memory` subsystems could
recovered successfully. But when call `CgroupsIsolator::destroy`, we would try to cleanup
all known subsystems `cpu`, `cpuacct`, `memory`, `perf_event`. This is what the test case
`MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PERF_RollForward` do exactly.


- haosdent


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49827/#review143861
-----------------------------------------------------------


On July 31, 2016, 5:47 p.m., haosdent huang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49827/
> -----------------------------------------------------------
> 
> (Updated July 31, 2016, 5:47 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
>     https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Implemented `CgroupsIsolatorProcess::cleanup`.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 422b4653fc3cb3d14c94b93ff456625fc59fbb27

>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp bd20631a9650cf84e99c6489b2e92bc40ed764ca

> 
> Diff: https://reviews.apache.org/r/49827/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> haosdent huang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message