mesos-dev mailing list archives

From "Till Toenshoff (JIRA)" <>
Subject [jira] [Commented] (MESOS-816) Allow delegation to shell scripts for isolation
Date Tue, 11 Feb 2014 06:01:28 GMT


Till Toenshoff commented on MESOS-816:

Hey Tom,

I have just updated and opened up for our early-stage but
very flexible implementation of a pluggable containerizer. All of that is based on Ian's upcoming
patches ( etc as mentioned above). We are still in the process
of updating this, as there are certainly a few nits to iron out.

Please feel very much encouraged to comment on that review request and/or share your thoughts
on IRC or the mailing list. :-)

> Allow delegation to shell scripts for isolation
> -----------------------------------------------
>                 Key: MESOS-816
>                 URL:
>             Project: Mesos
>          Issue Type: Improvement
>          Components: isolation, slave
>            Reporter: Jason Dusek
>            Priority: Minor
>         Attachments: mesos-shell-isolator.jpg
>   Original Estimate: 72h
>  Remaining Estimate: 72h
> Being able to delegate isolation to shell scripts could make it easier to leverage the
machinery provided by the LXC tools, LibVirt, VirtualBox, Docker and similar containerization tools.
> Why go through command line tools for isolation? We have seen many requests for isolation,
covering a wide variety of scenarios:
> - Setups requiring multiple versions of the same language (Ruby 1.8, Ruby 1.9).
> - Setups requiring installation and configuration of RPM-packaged applications.
> - Build-and-test setups, where sharing the environment of the host would impact reproducibility.
> - Integration of 3rd party, service-oriented applications.
> - Launching applications with Docker.
> - Launching multiple instances of a Mesos framework that, like Hadoop, has significant
system setup and dependencies.
> To cover these and other use cases, it seems reasonable to allow Mesos to delegate to
external programs for isolation:
> - It makes it easier to experiment with new containerization tools.
> - It allows for site administrators to customize containerization, or even implement
new containerization mechanisms, without impacting their ability to keep pace with Mesos development.
> - Many external programs exist for containerization -- Docker, LXC tools, LibVirt --
which handle a great deal of the book-keeping around finding and efficiently cloning disk
images and setting up the guest system (its hostname, TTYs, /dev/*, /proc).
> The scenarios listed above can be understood in terms of three use cases:
> - The containerized system service scenario, wherein an application, installed with RPM
or a similar tool, is started and managed by the init system within a container. Percona MySQL
is an example of such an application.
> - The containerized application scenario, wherein an application is installed or unpacked
and then configured and launched in a single command. For example, running a custom Rails
app with bundle install && bundle exec rails.
> - The containerized framework/executor scenario, wherein the application is Spark, Hadoop
or another Mesos framework/executor pair.
> One way to achieve this could be to introduce an External Isolator, which works in parallel
with the existing process/posix and cgroups isolators. The responsibility of this isolator
would be to act as a thin layer to external isolators. Calls for task launching, stopping
or any other resource change would be serialized and passed to the external isolators by the
Mesos External Isolator. 
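
A minimal sketch of that thin layer, assuming (the proposal does not specify this) that calls are serialized as JSON and written to the external program's standard input. The names `delegate` and `ECHO_ISOLATOR`, the request layout and the payload keys are all hypothetical; the stand-in child process only echoes the action it received.

```python
import json
import subprocess
import sys


def delegate(program, action, payload):
    """Serialize one isolator call as JSON and hand it to an external program.

    `program` is the argv list of the external isolator (hypothetical).
    The request is written to the program's stdin; its stdout is returned.
    """
    request = json.dumps({"action": action, "payload": payload})
    result = subprocess.run(
        program,
        input=request,
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit status raises CalledProcessError
    )
    return result.stdout


# Stand-in for a real external isolator: a tiny Python child process that
# reads the request and reports which action it was asked to perform.
ECHO_ISOLATOR = [
    sys.executable,
    "-c",
    "import json, sys; r = json.load(sys.stdin); print('handled ' + r['action'])",
]
```

Calling `delegate(ECHO_ISOLATOR, "launch", {"task_id": "t1"})` passes the request through the child process. A real external isolator could be a shell script wrapping Docker or the LXC tools in the same position.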
> Allowing for pluggable isolators invites the possibility of having different isolators
per task. For applications using containers, it's reasonable that each application or framework
can specify a different base image; and this would be an option passed to the corresponding
isolator. One can also imagine specialized frameworks that need to disable isolation entirely.
For example, a "system backup" framework that would specify a null isolator to allow it to
snapshot interesting data on each slave and transfer it to a sanctioned storage location.
> However, having users and framework authors specify isolators would both harm
portability and make isolation their problem, rather than something handled transparently
by Mesos. Furthermore, it would have the unintended effect of putting them at odds with site
administrators, who would also specify isolators -- as a command line option for each slave.
> Allowing tasks to carry a more abstract notion of "container" with them would allow for
most application level scenarios we've outlined above.  Theoretically, more than one isolator
might be able to handle a given container. For example, if the container is specified as an
"ISO" and a distro LiveCD is provided, one could imagine a Docker isolator, LXC isolator or
VirtualBox isolator handling it. Encouraging users and framework authors to specify a container
would be simpler for them than specifying isolator flags, would allow them to document
their intent more clearly, and would reduce the scope for conflict with other parties who have an interest in
upgrading and tuning isolation. It also makes applications and command examples more portable,
by decoupling the isolation mechanism from the desired container layout (which is, more or
less, a chroot with some files in it).
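
The decoupling described above can be sketched as a lookup in which the task names only a container type and the site decides which isolator handles it. The table `SUPPORTED_TYPES`, the function `choose_isolator` and the idea of an ordered site preference list are assumptions made for illustration, not part of the proposal.

```python
# Hypothetical table: which container types each isolator can handle.
# The "iso" entries follow the LiveCD example in the text above.
SUPPORTED_TYPES = {
    "docker": {"docker", "iso"},
    "lxc": {"lxc", "iso"},
    "virtualbox": {"iso"},
}


def choose_isolator(image_type, site_preference):
    """Return the first isolator in the administrator's preference order
    that can handle `image_type`, or None if none of them can."""
    for name in site_preference:
        if image_type in SUPPORTED_TYPES.get(name, set()):
            return name
    return None
```

With a site preference of `["lxc", "docker"]`, an "iso" container would go to the LXC isolator, while a "docker" container would still reach the Docker isolator. The task description is the same in both cases.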
> To this end, we propose adding an optional ContainerInfo to each CommandInfo:
>     message CommandInfo {
>       message ContainerInfo {
>         required bytes image = 1;
>         repeated bytes options = 2;
>       }
>       ...
>       optional ContainerInfo container = 4;
>     }
> The first field of the ContainerInfo should indicate the image, perhaps as a URL. For example:
>     docker:///johncosta/redis
>     iso+
>     lxc:///ubuntu
> The scheme of the URL -- recognizable as a string of letters and digits and perhaps plusses,
dots and dashes preceding the first `://`, per RFC 3986 -- serves to indicate the type of
the container, which isolators can use to determine both what to do with a container and how
to obtain it. For the Docker URL type, for example, the absence of a host between the second
and third slashes could be interpreted to mean that the image should be fetched from the Docker
index or from a locally configured default Docker image server; whereas if a hostname is given,
it is treated as the image server to use.
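
One way an isolator might split such a URL is sketched below, using the scheme grammar from RFC 3986 (a letter followed by letters, digits, `+`, `.` or `-`). The function name `parse_image_url` and the host `registry.example.com` are hypothetical; treating an empty host as "use the default image server" follows the Docker interpretation described above.

```python
import re

# scheme "://" host path, where the scheme follows the RFC 3986 grammar.
IMAGE_URL_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*)://([^/]*)(/.*)?$")


def parse_image_url(url):
    """Split a container image URL into (scheme, host, path).

    The host is None when nothing appears between the second and third
    slashes, which an isolator could read as "use the default server".
    """
    match = IMAGE_URL_RE.match(url)
    if match is None:
        raise ValueError("not a container image URL: %r" % url)
    scheme, host, path = match.group(1), match.group(2), match.group(3) or ""
    return scheme.lower(), host or None, path
```

For `docker:///johncosta/redis` this yields the scheme `docker`, no host, and the path `/johncosta/redis`. With a host present, as in `docker://registry.example.com/johncosta/redis`, the host is returned separately so the isolator can use it as the image server.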
> The addition of "options" to the ContainerInfo poses a risk to portability and warrants
both explanation and justification. In the case of Docker URLs, for example, it is possible
to mount additional filesystems on the Docker command line; and these filesystems can even
be indicated by reference to another Docker container by name. Support for this feature is
clearly tied to the Docker URL and its meaning.
> When the default isolator for a slave is specified, there may also be a default container
specified. It is good for us, then, that the ContainerInfo structure maps cleanly to an array
of byte strings, since this is an easy thing to handle from the command line.
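
To illustrate why that mapping is convenient, here is a small sketch that reads a default container from a list of command line strings and flattens it back again. The `ContainerInfo` dataclass only mirrors the proposed message; the helper names and the convention that the first string is the image are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContainerInfo:
    # Mirrors the proposed message: one required image, repeated options.
    image: bytes
    options: List[bytes] = field(default_factory=list)


def container_from_args(args):
    """Treat the first string as the image and the rest as options."""
    if not args:
        raise ValueError("a default container needs at least an image")
    encoded = [a.encode("utf-8") for a in args]
    return ContainerInfo(image=encoded[0], options=encoded[1:])


def container_to_args(container):
    """Inverse mapping: flatten a ContainerInfo back to a list of strings."""
    return [b.decode("utf-8") for b in [container.image, *container.options]]
```

Under these assumptions, the strings `docker:///storm-mesos/latest`, `-p`, `1337:8000` would become a ContainerInfo with that image and two options, and the round trip returns the original list.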
> Now in practice, how will we use the ContainerInfo? In the three use cases outlined above
-- service container, command container and containerized executor -- tasks needing a special
container will specify an ExecutorInfo in the TaskInfo and not a bare CommandInfo. The ContainerInfo
would then be part of the CommandInfo embedded in the ExecutorInfo.
> To consider a specific case, were the Storm framework packaged in a container, then the
same container could be used both for Nimbus and the worker nodes:
> * Nimbus would be launched with a TaskInfo requesting the container and launching Nimbus.
>         TaskInfo {
>           executor = ExecutorInfo {
>             command = CommandInfo {
>               value = "python /opt/storm/bin/storm go"
>               container = ContainerInfo {
>                 image = "docker:///storm-mesos/latest"
>                 options = [ "-p", "1337:8000" ]
>               }
>             }
>             ...
>           }
>           ...
>         }
> * Nimbus would launch executors with a TaskInfo requesting the very same container, but
specifying a different command.
>         TaskInfo {
>           executor = ExecutorInfo {
>             command = CommandInfo {
>               value = "curl -sSfL http://storm.server:1337/conf/storm.yaml -o /opt/storm/conf/storm.yaml
&& python /opt/storm/bin/storm supervisor storm.mesos.MesosSupervisor"
>               container = ContainerInfo {
>                 image = "docker:///storm-mesos/latest"
>               }
>             }
>             ...
>           }
>           ...
>         }
> While in the near term we expect container URLs to be pretty specific to the containerization
mechanism, let us hope for a glorious future with URLs like `img:///ubuntu-13.04` that point
to well-known, portable images.

This message was sent by Atlassian JIRA
