mesos-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Peach <jor...@gmail.com>
Subject Re: mesos git commit: Made `gpu/nvidia` isolator works with `cgroups/all` isolation option.
Date Wed, 27 Jun 2018 06:42:41 GMT
Do we still need this check? The order of built-in isolators is now fixed, so do we still need
to verify this ordering?

> On Jun 27, 2018, at 4:17 PM, gilbert@apache.org wrote:
> 
> Repository: mesos
> Updated Branches:
>  refs/heads/master 2e913d545 -> b581136bd
> 
> 
> Made `gpu/nvidia` isolator works with `cgroups/all` isolation option.
> 
> Review: https://reviews.apache.org/r/67743/
> 
> 
> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/b581136b
> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/b581136b
> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/b581136b
> 
> Branch: refs/heads/master
> Commit: b581136bd5d28ea74e410c7a57ac10d05c334b5b
> Parents: 2e913d5
> Author: Qian Zhang <zhq527725@gmail.com>
> Authored: Tue Jun 26 23:16:36 2018 -0700
> Committer: Gilbert Song <songzihao1990@gmail.com>
> Committed: Tue Jun 26 23:16:36 2018 -0700
> 
> ----------------------------------------------------------------------
> .../mesos/isolators/gpu/isolator.cpp            | 33 +++++++++++++++++---
> 1 file changed, 29 insertions(+), 4 deletions(-)
> ----------------------------------------------------------------------
> 
> 
> http://git-wip-us.apache.org/repos/asf/mesos/blob/b581136b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> ----------------------------------------------------------------------
> diff --git a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> index d79c940..5066882 100644
> --- a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> +++ b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> @@ -102,20 +102,45 @@ Try<Isolator*> NvidiaGpuIsolatorProcess::create(
>     const Flags& flags,
>     const NvidiaComponents& components)
> {
> -  // Make sure both the 'cgroups/devices' isolator and the
> -  // 'filesystem/linux' isolators are present and precede the GPU
> -  // isolator.
> +  // Make sure both the 'cgroups/devices' (or 'cgroups/all') isolator and the
> +  // 'filesystem/linux' isolators are present and precede the GPU isolator.
>   vector<string> tokens = strings::tokenize(flags.isolation, ",");
> 
>   auto gpuIsolator =
>     std::find(tokens.begin(), tokens.end(), "gpu/nvidia");
> +
>   auto devicesIsolator =
>     std::find(tokens.begin(), tokens.end(), "cgroups/devices");
> +
> +  auto cgroupsAllIsolator =
> +    std::find(tokens.begin(), tokens.end(), "cgroups/all");
> +
>   auto filesystemIsolator =
>     std::find(tokens.begin(), tokens.end(), "filesystem/linux");
> 
>   CHECK(gpuIsolator != tokens.end());
> 
> +  if (cgroupsAllIsolator != tokens.end()) {
> +    // The reason that we need to check if `devices` cgroups subsystem is
> +    // enabled is, when `cgroups/all` is specified in the `--isolation` agent
> +    // flag, cgroups isolator will only load the enabled subsystems. So if
> +    // `cgroups/all` is specified but `devices` is not enabled, cgroups isolator
> +    // will not load `devices` subsystem in which case we should error out.
> +    Try<bool> result = cgroups::enabled("devices");
> +    if (result.isError()) {
> +      return Error(
> +          "Failed to check if the `devices` cgroups subsystem"
> +          " is enabled by kernel: " + result.error());
> +    } else if (!result.get()) {
> +      return Error(
> +          "The `devices` cgroups subsystem is not enabled by the kernel");
> +    }
> +
> +    if (devicesIsolator > cgroupsAllIsolator) {
> +      devicesIsolator = cgroupsAllIsolator;
> +    }
> +  }
> +
>   if (devicesIsolator == tokens.end()) {
>     return Error("The 'cgroups/devices' isolator must be enabled in"
>                  " order to use the 'gpu/nvidia' isolator");
> @@ -127,7 +152,7 @@ Try<Isolator*> NvidiaGpuIsolatorProcess::create(
>   }
> 
>   if (devicesIsolator > gpuIsolator) {
> -    return Error("'cgroups/devices' must precede 'gpu/nvidia'"
> +    return Error("'cgroups/devices' or 'cgroups/all' must precede 'gpu/nvidia'"
>                  " in the --isolation flag");
>   }
> 
> 


Mime
View raw message