mesos-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Qian Zhang <zhq527...@gmail.com>
Subject Re: mesos git commit: Made `gpu/nvidia` isolator works with `cgroups/all` isolation option.
Date Thu, 28 Jun 2018 07:17:09 GMT
Thanks James for reminding this!

You are right, we do not need this check, I have posted a patch (
https://reviews.apache.org/r/67767/) to remove it, can you help review it?


Regards,
Qian Zhang

On Wed, Jun 27, 2018 at 2:42 PM, James Peach <jorgar@gmail.com> wrote:

> Do we still need this check? The order of built-in isolators is now fixed,
> so do we still need to verify this ordering?
>
> > On Jun 27, 2018, at 4:17 PM, gilbert@apache.org wrote:
> >
> > Repository: mesos
> > Updated Branches:
> >  refs/heads/master 2e913d545 -> b581136bd
> >
> >
> > Made `gpu/nvidia` isolator works with `cgroups/all` isolation option.
> >
> > Review: https://reviews.apache.org/r/67743/
> >
> >
> > Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> > Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/b581136b
> > Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/b581136b
> > Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/b581136b
> >
> > Branch: refs/heads/master
> > Commit: b581136bd5d28ea74e410c7a57ac10d05c334b5b
> > Parents: 2e913d5
> > Author: Qian Zhang <zhq527725@gmail.com>
> > Authored: Tue Jun 26 23:16:36 2018 -0700
> > Committer: Gilbert Song <songzihao1990@gmail.com>
> > Committed: Tue Jun 26 23:16:36 2018 -0700
> >
> > ----------------------------------------------------------------------
> > .../mesos/isolators/gpu/isolator.cpp            | 33
> +++++++++++++++++---
> > 1 file changed, 29 insertions(+), 4 deletions(-)
> > ----------------------------------------------------------------------
> >
> >
> > http://git-wip-us.apache.org/repos/asf/mesos/blob/b581136b/
> src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> > ----------------------------------------------------------------------
> > diff --git a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> > index d79c940..5066882 100644
> > --- a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> > +++ b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> > @@ -102,20 +102,45 @@ Try<Isolator*> NvidiaGpuIsolatorProcess::create(
> >     const Flags& flags,
> >     const NvidiaComponents& components)
> > {
> > -  // Make sure both the 'cgroups/devices' isolator and the
> > -  // 'filesystem/linux' isolators are present and precede the GPU
> > -  // isolator.
> > +  // Make sure both the 'cgroups/devices' (or 'cgroups/all') isolator
> and the
> > +  // 'filesystem/linux' isolators are present and precede the GPU
> isolator.
> >   vector<string> tokens = strings::tokenize(flags.isolation, ",");
> >
> >   auto gpuIsolator =
> >     std::find(tokens.begin(), tokens.end(), "gpu/nvidia");
> > +
> >   auto devicesIsolator =
> >     std::find(tokens.begin(), tokens.end(), "cgroups/devices");
> > +
> > +  auto cgroupsAllIsolator =
> > +    std::find(tokens.begin(), tokens.end(), "cgroups/all");
> > +
> >   auto filesystemIsolator =
> >     std::find(tokens.begin(), tokens.end(), "filesystem/linux");
> >
> >   CHECK(gpuIsolator != tokens.end());
> >
> > +  if (cgroupsAllIsolator != tokens.end()) {
> > +    // The reason that we need to check if `devices` cgroups subsystem
> is
> > +    // enabled is, when `cgroups/all` is specified in the `--isolation`
> agent
> > +    // flag, cgroups isolator will only load the enabled subsystems. So
> if
> > +    // `cgroups/all` is specified but `devices` is not enabled, cgroups
> isolator
> > +    // will not load `devices` subsystem in which case we should error
> out.
> > +    Try<bool> result = cgroups::enabled("devices");
> > +    if (result.isError()) {
> > +      return Error(
> > +          "Failed to check if the `devices` cgroups subsystem"
> > +          " is enabled by kernel: " + result.error());
> > +    } else if (!result.get()) {
> > +      return Error(
> > +          "The `devices` cgroups subsystem is not enabled by the
> kernel");
> > +    }
> > +
> > +    if (devicesIsolator > cgroupsAllIsolator) {
> > +      devicesIsolator = cgroupsAllIsolator;
> > +    }
> > +  }
> > +
> >   if (devicesIsolator == tokens.end()) {
> >     return Error("The 'cgroups/devices' isolator must be enabled in"
> >                  " order to use the 'gpu/nvidia' isolator");
> > @@ -127,7 +152,7 @@ Try<Isolator*> NvidiaGpuIsolatorProcess::create(
> >   }
> >
> >   if (devicesIsolator > gpuIsolator) {
> > -    return Error("'cgroups/devices' must precede 'gpu/nvidia'"
> > +    return Error("'cgroups/devices' or 'cgroups/all' must precede
> 'gpu/nvidia'"
> >                  " in the --isolation flag");
> >   }
> >
> >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message