aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joshua Cohen <jco...@apache.org>
Subject Re: Review Request 51298: A few executor fixes for filesystem isolation:
Date Thu, 25 Aug 2016 18:54:38 GMT


> On Aug. 24, 2016, 11:21 a.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/bin/thermos_executor_main.py, lines 117-124
> > <https://reviews.apache.org/r/51298/diff/2/?file=1482575#file1482575line117>
> >
> >     Have you considered reading that value from the `MESOS_SANDBOX` environment
variable?
> >     
> >     From the docs: "MESOS_SANDBOX: Path to the mapped sandbox inside of the container
(determined by the agent flag sandbox_directory) for either mesos container with image or
docker container. For the case of command task without image specified, it is the path to
the sandbox on the host filesystem, which is identical to MESOS_DIRECTORY. MESOS_DIRECTORY
is always the sandbox on the host filesystem."

I did indeed consider this! However, in my testing the value of `MESOS_SANDBOX` matches the
value of `MESOS_DIRECTORY` for our method of isolating the task's filesystem. I believe this
is because the task is not actually specifying a `ContainerInfo`, but is instead mounting
a `Volume` with an `Image` set.

I'll add a comment explaining as much.


> On Aug. 24, 2016, 11:21 a.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/common/sandbox.py, line 257
> > <https://reviews.apache.org/r/51298/diff/2/?file=1482576#file1482576line257>
> >
> >     I believe we don't need that `or mesos_directory` here. It should never happen
that sanbox_mount_point is None. And if it happens, it is better to crash early rather then
running into undefined behaviour later on.

Done, also now raising an error in `__init__` if the value is `None`.


> On Aug. 24, 2016, 11:21 a.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/common/sandbox.py, lines 266-270
> > <https://reviews.apache.org/r/51298/diff/2/?file=1482576#file1482576line266>
> >
> >     We are executing this as root on the host filesystem. It might therefore make
sense to pass `-n` here to prevent that the `/etc/mtab` is modified.

I *believe* that Mesos is already running the executor in its own mount namespace, so there's
no chance of modifying the host's /etc/mtab.


> On Aug. 24, 2016, 11:21 a.m., Stephan Erb wrote:
> > src/main/python/apache/thermos/core/process.py, lines 189-200
> > <https://reviews.apache.org/r/51298/diff/2/?file=1482578#file1482578line189>
> >
> >     The `ProcessBase` class is normaly oblivious to the existance of Mesos. Do you
feel it would make sense to inline this special case at the call-side?
> >     
> >     I haven't checked that, but I suppose in its current place we would leak the
containerizer launch call in the Thermos UI?

It does not leak the containerizer launch call to the Thermos UI. The observer takes a circuitous
path, but essentially loads up the contents of task.json from the sandbox which has the raw
cmdline and not the value generated here that's wrapped by `mesos-containerizer launch`.

As to your larger question, I'm not sure I follow your meaning. As part of refactoring to
properly support .thermos_profile, I moved the mesos-specific logic from ProcessBase to Process,
but that doesn't feel like a major shift. Are you instead saying that everything Mesos-related
should live under apache.aurora rather apache.thermos? I can see the case for that, but I
feel like that split is artificial and just adds to complexity. We don't have any intentions
of supporting Thermos as a standalone process manager and I'd rather not make this already
complex change even more complex by adding the need to plumb this command through to process
from who knows where ;).


> On Aug. 24, 2016, 11:21 a.m., Stephan Erb wrote:
> > src/main/python/apache/thermos/core/process.py, line 420
> > <https://reviews.apache.org/r/51298/diff/2/?file=1482578#file1482578line420>
> >
> >     This will point to a wrong path in the `taskfs_isolated` case as the profile
will be sourced by bash before calling the containerize launch command. At that point in time
we are still in the host filesystem.
> >     
> >     Might be worthwhile to add a test to check that it is working correctly.

I fixed support for .thermos_profile and added coverage to the e2e tests to make sure it works.


> On Aug. 24, 2016, 11:21 a.m., Stephan Erb wrote:
> > src/main/python/apache/thermos/core/process.py, line 437
> > <https://reviews.apache.org/r/51298/diff/2/?file=1482578#file1482578line437>
> >
> >     In the `taskfs_isolated` case, `cwd` will be set to a path within the container,
but we use it before the pivot root is completed. This is a problem similar to the thermos_profile
issue mentioned above.

I'm not sure what you're getting at here? The `cwd` arg is passed on to mesos-containerizer
as the value for the `--working_directory` flag. That value is used to `chdir` after the pivot_root/chroot
logic is run. I've also thoroughly verified that this directory is respected by the running
task (original by the `create_datafile` process in the e2e test, now replaced by the `setup_env`
process).


> On Aug. 24, 2016, 11:21 a.m., Stephan Erb wrote:
> > src/main/python/apache/thermos/runner/thermos_runner.py, lines 139-144
> > <https://reviews.apache.org/r/51298/diff/2/?file=1482579#file1482579line139>
> >
> >     Can this be changed in a meaningful way? I have always assumed that the observer
UI has the hardcoded assumption that logs can be found within `sandbox/.logs`.

You're right, this was working only by coincidence (because the value being passed was a bind
mount of the sandbox). I cleaned this up which uncovered a few more bugs (e.g. the runner
was ensuring the sandbox directory existed, so when we passed in /mnt/mesos/sandbox as the
value, that path was created on the host filesystem).


> On Aug. 24, 2016, 11:21 a.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/common/sandbox.py, line 36
> > <https://reviews.apache.org/r/51298/diff/1-2/?file=1481195#file1481195line36>
> >
> >     Please extend this doc string slightly to mention that this path is within the
host filesystem.

Done.


- Joshua


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51298/#review146619
-----------------------------------------------------------


On Aug. 25, 2016, 6:54 p.m., Joshua Cohen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51298/
> -----------------------------------------------------------
> 
> (Updated Aug. 25, 2016, 6:54 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> - Add an option to skip the groupadd/useradd calls into the task's filesystem.
> - Mount any configured volumes into the task's filesystem.
> - Clean up http server script used by appc e2e tests.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py dde19a6914c7c7b2178220707f242f61f11f38bd

>   src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 65a495d5c50d91b38c4328bab3bfec667f6a7ba9

>   src/main/python/apache/aurora/executor/common/sandbox.py 5f091af7636bd94f028f15d63437e305b02f741c

>   src/main/python/apache/aurora/executor/thermos_task_runner.py 1d713ca3d81a2e7be88b787dfcab328d887be24c

>   src/main/python/apache/thermos/core/process.py a296fa715ef6fbb8d5feae356914334437f353f1

>   src/main/python/apache/thermos/core/runner.py dcfc190f7ed529e4f816e02b2d969077c60b008f

>   src/main/python/apache/thermos/runner/thermos_runner.py 441bacdfe93b1805a03a1216762c74db810a9540

>   src/test/python/apache/aurora/executor/common/test_sandbox.py ce989b1ccda0f1bc7ba9e15dfe4be20116db3491

>   src/test/python/apache/aurora/executor/test_thermos_executor.py 06601df3bc355a690ff1789b2e2e34484fadefe9

>   src/test/python/apache/thermos/core/test_process.py 759f783202803c296ce19bb64c59cbe896d40a43

>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora b69ddf129c0015a878b089d85d731bc0c26fd55c

>   src/test/sh/org/apache/aurora/e2e/run-server.sh 76939888bed2e8138671d97f7bc56fd5641008e4

>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 0404d0e0a1a1054a5ed0c0b9f5f5be9fb3ecc113

> 
> Diff: https://reviews.apache.org/r/51298/diff/
> 
> 
> Testing
> -------
> 
> ./build-support/jenkins/build.sh
> e2e tests
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message