mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kevin Klues <klue...@gmail.com>
Subject Re: Review Request 47707: Refactored resource enumeration for containerizers and isolators.
Date Mon, 23 May 2016 20:24:48 GMT


> On May 23, 2016, 7:56 p.m., Benjamin Mahler wrote:
> > src/slave/containerizer/composing.cpp, lines 276-280
> > <https://reviews.apache.org/r/47707/diff/1/?file=1391148#file1391148line276>
> >
> >     How about "managed" and "manage" instead of "enumerated" and "enumerate" here?
It seems like the important thing to understand in this comment is that each containerizer
is trying to manage resources?

Sounds good to me.


> On May 23, 2016, 7:56 p.m., Benjamin Mahler wrote:
> > src/slave/containerizer/composing.cpp, lines 304-306
> > <https://reviews.apache.org/r/47707/diff/1/?file=1391148#file1391148line304>
> >
> >     Hm.. how would this read?
> >     
> >     Failed to merge 'cpus' managed by the composed containerizers: one containerizer
is managing 2cpus and another is managing 3cpus
> >     
> >     You could print this by stringifying the filtered resources.

Sure, this sounds fine as well.


> On May 23, 2016, 7:56 p.m., Benjamin Mahler wrote:
> > src/slave/containerizer/containerizer.hpp, lines 64-68
> > <https://reviews.apache.org/r/47707/diff/1/?file=1391149#file1391149line64>
> >
> >     Hm.. it seems more like the default set of resources would not include the flags.
> >     
> >     The current function is a little more than just the default resources, it is
the flag provided resources along with defaults injected in for any that were omitted.
> >     
> >     Could we instead provide just the defaults (per the suggested signature above)
and have containerizer implementations inject defaults with the help of this function?
> >     
> >     As an example:
> >     
> >     ```
> >     // Provides a means for containerizer implementation to use a 
> >     // consistent set of default resources when resources are not
> >     // specified via the '--resources' flag. The containerizer may
> >     // choose its own default value by probing the system, however
> >     // this helper provides a standard set of defaults for:
> >     //
> >     //     ["cpus", "mem", "disk", "ports"]
> >     //
> >     // A containerizer is free to choose alternative defaults for these
> >     // resources, but this increases the likelihood that containerizers
> >     // disagree about the amount of resources to manage when used within
> >     // the composing containerizer.
> >     static Try<Resources> defaultResources();
> >     ```
> >     
> >     This means we could augment the Containerizer::resources and Isolator::resources
to do two things:
> >     
> >     (1) Get informed about the resources provided via flags, and
> >     (2) Return additional resources that the containerizer or isolator wishes to
manage.
> >     
> >     Something like:
> >     
> >     ```
> >     // Informs the containerizer / isolator about the resources
> >     // that are explicitly specified. The containerizer / isolator
> >     // returns a subset of the resources that it will manage. The
> >     // containerizer / isolator may also return additional resources
> >     // that it wishes to manage if they were not specified.
> >     // For example, a GPU isolator, seeing that "gpus" is unspecified
> >     // may probe the system and decide to manage some GPUs (i.e. it
> >     // will return these GPUs from this function).
> >     Future<Resources> manage(const Resources& provided);
> >     ```

We could do this (though it feels a little less intuitive and less flexible than allowing
arbitrary flags to be passed in).  We are also unnecessarily enumerating all of the `cpu,
mem, disk, ports` available on the system even if the flags would indicate to override them.
Is the assumption that the actual class implementing `manage()` would need to find a way to
get at the flags itself if it relies on them?  How would the default implementation of `manage()`
work?  The whole goal of it was to avoid having to push new logic down into each existing
containerizer (which your proposal will require in order to do the flags parsing somehwere
to override the defaults).


> On May 23, 2016, 7:56 p.m., Benjamin Mahler wrote:
> > src/slave/containerizer/containerizer.cpp, lines 57-61
> > <https://reviews.apache.org/r/47707/diff/1/?file=1391150#file1391150line57>
> >
> >     Why this comment in the .cpp in addition to the one in the .hpp?

I can remove it.


> On May 23, 2016, 7:56 p.m., Benjamin Mahler wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 479-480
> > <https://reviews.apache.org/r/47707/diff/1/?file=1391152#file1391152line479>
> >
> >     Ditto on "manage" vs "enumerate" here.

OK.


> On May 23, 2016, 7:56 p.m., Benjamin Mahler wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp, line 242
> > <https://reviews.apache.org/r/47707/diff/1/?file=1391156#file1391156line242>
> >
> >     It's a bit implicit that this condition ensures that `resources.gpus().isSome()`,
could we store the return value of `resources.gpus()` and use it in both the condition and
the validation?
> >     
> >     ```
> >     Option<double> gpus = resources.gpus();
> >     
> >     if (gpus.isSome()) {
> >       ...
> >       long long millis = static_cast<long long>(gpus.get() * 1000);
> >       ...
> >     }
> >     ```
> >     
> >     This seems clearer?

I was tinkering with something like this, but didn't have a good name for it (`gpus`) is already
taken. I can figure somthing out if you think it makes it more readable.


> On May 23, 2016, 7:56 p.m., Benjamin Mahler wrote:
> > src/slave/slave.cpp, lines 476-478
> > <https://reviews.apache.org/r/47707/diff/1/?file=1391157#file1391157line476>
> >
> >     Would you mind adding a TODO to avoid waiting forever? I'm just thinking of
the debugability when there's a inifitely pending future coming out of an isolator module.

Sure.


> On May 23, 2016, 7:56 p.m., Benjamin Mahler wrote:
> > src/slave/slave.cpp, lines 481-484
> > <https://reviews.apache.org/r/47707/diff/1/?file=1391157#file1391157line481>
> >
> >     Could you handle the discarded case? We don't expect a discarded result since
we do not request a discard on this future, but it would be nice to avoid the call to .get()
without validating this:
> >     
> >     ```
> >     CHECK_READY(resources);
> >     ```
> >     
> >     Or modify the error handling:
> >     
> >     ```
> >       if (!resources.isReady()) {
> >         EXIT(EXIT_FAILURE)
> >           << "Failed to determine agent resources: " << resources.isFailed()
? resources.failure() : "discarded";
> >       }
> >     ```

Sure.


- Kevin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47707/#review134414
-----------------------------------------------------------


On May 23, 2016, 12:45 a.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47707/
> -----------------------------------------------------------
> 
> (Updated May 23, 2016, 12:45 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Jie Yu.
> 
> 
> Bugs: MESOS-5256
>     https://issues.apache.org/jira/browse/MESOS-5256
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, the `Containerizer` class contained a static `resources()`
> function for enumerating all of the resources available on an agent.
> This made it hard to add new resources to this enumeration that were,
> for example, containerizer-specific or tied to a new isolator module
> loaded at runtime.
> 
> This commit refactors the resource enumeration logic to allow each
> containerizer to enumerate resources itself. It does this by adding a
> new virtual `resources()` function, whose default implementation
> includes the same logic as the old static version of `resources()`.
> Individual containerizers simply overwrite this function to do their
> own enumeration if they want to.
> 
> Similar logic exists to push resource enumeration down into isolators
> as well. As such, the logic for the Nvidia GPU resource enumeration has
> been moved down into the Nvidia GPU isolator instead of being
> hard-coded into the mesos containerizer itself.
> 
> 
> Diffs
> -----
> 
>   include/mesos/slave/isolator.hpp 4be8c2bb409052e2e07138483408209384f41e23 
>   src/slave/containerizer/composing.hpp f3eebd19bc9e6b3b8a969a2ad967b3e2909e0ee4 
>   src/slave/containerizer/composing.cpp 15d059f0bbda4e8cb93c65c09327dde1e34d3e7b 
>   src/slave/containerizer/containerizer.hpp ff78b4d0fd4a3b862f6019fc757c16b7367cd3cf

>   src/slave/containerizer/containerizer.cpp faa0c789dda8a6f36fdb6217b0bae270b6b8f2e2

>   src/slave/containerizer/mesos/containerizer.hpp a1a00020668f6da8d611f26e5637afffc87d09ba

>   src/slave/containerizer/mesos/containerizer.cpp 75e5a32a3e70ec60a6800e21a621673184ea0956

>   src/slave/containerizer/mesos/isolator.hpp bacd86af42d16cb7c9b6622dfb298dcaa7007b75

>   src/slave/containerizer/mesos/isolator.cpp 253ff3cea8aff3e7a3051fb5a763cc081f455f18

>   src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.hpp 502204650192d5ea44aa631eac8eb37e051843f0

>   src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp 8f81cb79c10261670efc9eaa8614751854f53806

>   src/slave/slave.cpp ce0e7b1f1d17c3b82d835b0a6296ed7b1e9eeac1 
> 
> Diff: https://reviews.apache.org/r/47707/diff/
> 
> 
> Testing
> -------
> 
> make -j check ( which fails on one test -- see https://reviews.apache.org/r/47708/ )
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message