mesos-dev mailing list archives

From "Vinod Kone (JIRA)" <>
Subject [jira] [Commented] (MESOS-816) Allow delegation to shell scripts for isolation
Date Mon, 13 Jan 2014 03:08:56 GMT


Vinod Kone commented on MESOS-816:

Hey Jason. Thanks for updating the description regarding the proposal.

A few of us spent the last week trying to hack up a way to integrate Docker into Mesos (in
a different way than what Mesosphere did). We more or less ended up with the same ContainerInfo
format that you describe above, which is great.

A few thoughts from last week's exercise:

--> When launching executors inside containers, we need to make sure that the requested image
includes the Mesos libs, because containers may not be able to access the libs installed
on the host system. This likely means that the system admins installing Mesos should provide users
with a set of images that already include Mesos.

--> While the above scenario is relatively easy to accomplish, the hard scenarios are those where a
user specifies an arbitrary task/executor to be run with an arbitrary image. How do we build
the appropriate Mesos libs, compatible with the given image, and get them into the container? This
is not entirely clear to us at this point.

--> Also, this ticket seems to conflate supporting a custom/external container with
having file system isolation in Mesos. The latter could be achieved without the former.
That said, Docker, LXC and others have made file system isolation easy, so our
future designs for containerization/isolation should make it easy to
support them.

> Allow delegation to shell scripts for isolation
> -----------------------------------------------
>                 Key: MESOS-816
>                 URL:
>             Project: Mesos
>          Issue Type: Improvement
>          Components: isolation, slave
>            Reporter: Jason Dusek
>            Priority: Minor
>         Attachments: mesos-shell-isolator.jpg
>   Original Estimate: 72h
>  Remaining Estimate: 72h
> Being able to delegate isolation to shell scripts could make it easier to leverage the
machinery provided by the LXC tools, LibVirt, VirtualBox, Docker and similar containerization
> Why go through command line tools for isolation? We have seen many requests for isolation,
covering a wide variety of scenarios:
> - Setups requiring multiple versions of the same language (Ruby 1.8, Ruby 1.9).
> - Setups requiring installation and configuration of RPM-packaged applications.
> - Build-and-test setups, where sharing the environment of the host would impact reproducibility.
> - Integration of 3rd party, service-oriented applications.
> - Launching applications with Docker.
> - Launching multiple instances of a Mesos framework that, like Hadoop, has significant
system setup and dependencies.
> To cover these and other use cases, it seems reasonable to allow Mesos to delegate to
external programs for isolation:
> - It makes it easier to experiment with new containerization tools.
> - It allows for site administrators to customize containerization, or even implement
new containerization mechanisms, without impacting their ability to keep pace with Mesos development.
> - Many external programs exist for containerization -- Docker, LXC tools, LibVirt --
which handle a great deal of the book-keeping around finding and efficiently cloning disk
images and setting up the guest system (its hostname, TTYs, /dev/*, /proc).
> The scenarios listed above can be understood in terms of three use cases:
> - The containerized system service scenario, wherein an application, installed with RPM
or a similar tool, is started and managed by the init system within a container. Percona MySQL
is an example of such an application.
> - The containerized application scenario, wherein an application is installed or unpacked
and then configured and launched in a single command. For example, running a custom Rails
app with bundle install && bundle exec rails.
> - The containerized framework/executor scenario, wherein the application is Spark, Hadoop
or another Mesos framework/executor pair.
> One way to achieve this could be to introduce an External Isolator, which works in parallel
with the existing process/posix and cgroups isolators. The responsibility of this isolator
would be to act as a thin layer to external isolators. Calls for task launching, stopping
or any other resource change would be serialized and passed to the external isolators by the
Mesos External Isolator. 
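To make the "thin layer" idea concrete, here is a minimal sketch of how a call could be serialized and handed to an external program. Everything in it is an assumption for illustration: the JSON-lines wire format, the action names (`launch`, `stop`, `update`), and the helper names `serialize_call` and `delegate` are hypothetical and not part of Mesos or of this proposal.

```python
import json
import subprocess

# Hypothetical action names; the proposal only mentions launching,
# stopping, and "any other resource change".
ACTIONS = ("launch", "stop", "update")


def serialize_call(action, task_id, resources=None):
    """Encode one isolator call as a single JSON line (assumed format)."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    message = {
        "action": action,
        "task_id": task_id,
        "resources": resources or {},
    }
    return json.dumps(message, sort_keys=True).encode("utf-8") + b"\n"


def delegate(argv, payload):
    """Run the external isolator (argv) and feed it the call on stdin.

    Returns the exit code and whatever the program wrote to stdout.
    """
    result = subprocess.run(argv, input=payload, capture_output=True, check=False)
    return result.returncode, result.stdout
```

A slave configured with some script path could then call `delegate([script_path], serialize_call("launch", "task-1", {"cpus": 1.0}))` and treat a non-zero exit code as a failed launch; how failures are reported back is left open here.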
> Allowing for pluggable isolators invites the possibility of having different isolators
per task. For applications using containers, it's reasonable that each application or framework
can specify a different base image; and this would be an option passed to the corresponding
isolator. One can also imagine specialized frameworks that need to disable isolation entirely.
For example, a "system backup" framework that would specify a null isolator to allow it to
snapshot interesting data on each slave and transfer it to a sanctioned storage location.
> However, letting users and framework authors specify isolators would both harm
portability and make isolation their problem, no longer something handled transparently
by Mesos. Furthermore, it would have the unintended effect of putting them at odds with site
administrators, who would also specify isolators -- as a command line option for each slave.
> Allowing tasks to carry a more abstract notion of "container" with them would allow for
most application level scenarios we've outlined above.  Theoretically, more than one isolator
might be able to handle a given container. For example, if the container is specified as an
"ISO" and a distro LiveCD is provided, one could imagine a Docker isolator, LXC isolator or
VirtualBox isolator handling it. Encouraging users and framework authors to specify a container
would be simpler for them than specifying isolator flags, would allow them to document
their intent more clearly, and would reduce the scope for conflict with other parties who have an interest in
upgrading and tuning isolation. It also makes applications and command examples more portable,
by decoupling the isolation mechanism from the desired container layout (which is, more or
less, a chroot with some files in it).
> To this end, we propose adding an optional ContainerInfo to each CommandInfo:
>     message CommandInfo {
>       message ContainerInfo {
>         required bytes image = 1;
>         repeated bytes options = 2;
>       }
>       ...
>       optional ContainerInfo container = 4;
>     }
> The first field of the ContainerInfo should indicate the image, perhaps as a URL. For example:
>     docker:///johncosta/redis
>     iso+
>     lxc:///ubuntu
> The scheme of the URL -- recognizable as a string of letters and digits and perhaps plusses,
dots and dashes preceding the first `://`, per RFC 3986 -- serves to indicate the type of
the container, which isolators can use to determine both what to do with a container and how
to obtain it. For the Docker URL type, for example, the absence of a host between the second
and third slashes could be interpreted to mean that the image should be fetched from the Docker
index or from a locally configured default Docker image server; whereas if a hostname is given,
it is treated as the image server to use.
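The scheme and host rules described above could be sketched roughly as follows. The `ImageRef` type and `parse_image_url` function are hypothetical names introduced for illustration; the only thing taken from RFC 3986 is the scheme grammar (a letter followed by letters, digits, `+`, `-` or `.`). Treating an empty host as "use the default image server" follows the interpretation suggested in the description and is not an existing Docker or Mesos behaviour.

```python
import re
from typing import NamedTuple, Optional

# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
# The host is whatever sits between "://" and the next slash (may be empty).
_IMAGE_URL = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*)://([^/]*)/(.*)$")


class ImageRef(NamedTuple):
    scheme: str           # container type, e.g. "docker" or "lxc"
    host: Optional[str]   # None means "use the default image server"
    path: str             # image name within that server


def parse_image_url(url: str) -> ImageRef:
    """Split a container image URL into scheme, optional host, and path."""
    match = _IMAGE_URL.match(url)
    if match is None:
        raise ValueError(f"not a container image URL: {url!r}")
    scheme, host, path = match.groups()
    # Schemes are case-insensitive; an empty host is reported as None.
    return ImageRef(scheme.lower(), host or None, path)
```

An isolator could dispatch on `ImageRef.scheme` to decide whether it can handle the container at all, and consult `ImageRef.host` only when deciding where to fetch the image from.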
> The addition of "options" to the ContainerInfo poses a risk to portability and warrants
both explanation and justification. In the case of Docker URLs, for example, it is possible
to mount additional filesystems on the Docker command line; and these filesystems can even
be indicated by reference to another Docker container by name. Support for this feature is
clearly tied to the Docker URL and its meaning.
> When the default isolator for a slave is specified, there may also be a default container
specified. It is good for us, then, that the ContainerInfo structure maps cleanly to an array
of byte strings, since this is an easy thing to handle from the command line.
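As a rough illustration of that mapping, the sketch below flattens a ContainerInfo into an array of byte strings and back. The Python `ContainerInfo` class only mirrors the two fields of the proposed message; the convention that the image comes first and the options follow is an assumption, as are the helper names `to_argv` and `from_argv`.

```python
from typing import List, NamedTuple, Sequence


class ContainerInfo(NamedTuple):
    """Python mirror of the proposed message: one image, repeated options."""
    image: bytes
    options: List[bytes]


def to_argv(info: ContainerInfo) -> List[bytes]:
    """Flatten to an array of byte strings: image first, then the options."""
    return [info.image, *info.options]


def from_argv(argv: Sequence[bytes]) -> ContainerInfo:
    """Rebuild a ContainerInfo from such an array (image is required)."""
    if not argv:
        raise ValueError("a ContainerInfo needs at least an image")
    return ContainerInfo(argv[0], list(argv[1:]))
```

Under this convention a default container given on a slave's command line would simply be a list of arguments, with no extra quoting or nesting to parse.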
> Now in practice, how will we use the ContainerInfo? In the three use cases outlined above
-- service container, command container and containerized executor -- tasks needing a special
container will specify an ExecutorInfo in the TaskInfo and not a bare CommandInfo. The ContainerInfo
would then be part of the CommandInfo embedded in the ExecutorInfo.
> To consider a specific case, were the Storm framework packaged in a container, then the
same container could be used both for Nimbus and the worker nodes:
> * Nimbus would be launched with a TaskInfo requesting the container and launching Nimbus.
>         TaskInfo {
>           executor = ExecutorInfo {
>             command = CommandInfo {
>               value = "python /opt/storm/bin/storm go"
>               containerInfo = ContainerInfo {
>                 image = "docker:///storm-mesos/latest"
>                 options = [ "-p", "1337:8000" ]
>               }
>             }
>             ...
>           }
>           ...
>         }
> * Nimbus would launch executors with a TaskInfo requesting the very same container, but
specifying a different command.
>         TaskInfo {
>           executor = ExecutorInfo {
>             command = CommandInfo {
>               value = "curl -sSfL http://storm.server:1337/conf/storm.yaml -o /opt/storm/conf/storm.yaml && python /opt/storm/bin/storm supervisor storm.mesos.MesosSupervisor"
>               containerInfo = ContainerInfo {
>                 image = "docker:///storm-mesos/latest"
>               }
>             }
>             ...
>           }
>           ...
>         }
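One way a Docker-backed isolator might turn the two TaskInfo examples above into a command line is sketched here. This is an assumption about a possible mapping, not something the proposal specifies: `docker_argv` is a hypothetical helper, it only accepts host-less `docker:///` URLs, it passes the options through verbatim ahead of the image (following the `docker run [OPTIONS] IMAGE [COMMAND]` form), and it assumes the image ships `/bin/sh` so that commands containing `&&` work.

```python
from typing import List, Sequence

_DOCKER_PREFIX = "docker:///"


def docker_argv(image_url: str, options: Sequence[str], command: str) -> List[str]:
    """Build a `docker run` argument vector for a containerized command."""
    if not image_url.startswith(_DOCKER_PREFIX):
        raise ValueError(f"unsupported image URL: {image_url!r}")
    image = image_url[len(_DOCKER_PREFIX):]
    # Run the command through a shell inside the container so that
    # shell syntax in CommandInfo.value (pipes, &&) keeps its meaning.
    return ["docker", "run", *options, image, "/bin/sh", "-c", command]
```

For the Nimbus example this yields `docker run -p 1337:8000 storm-mesos/latest /bin/sh -c "python /opt/storm/bin/storm go"`; the supervisor example differs only in the command string and the empty option list.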
> While in the near term we expect container URLs to be pretty specific to the containerization
mechanism, let us hope for a glorious future with URLs like `img:///ubuntu-13.04` that point
to well-known, portable images.

This message was sent by Atlassian JIRA
