aurora-issues mailing list archives

From "Joshua Cohen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AURORA-1632) Investigate executor fixes when Mesos 0.30.0 stops passing along environment variables
Date Thu, 10 Mar 2016 21:48:40 GMT

    [ https://issues.apache.org/jira/browse/AURORA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190001#comment-15190001 ]

Joshua Cohen commented on AURORA-1632:
--------------------------------------

I dug a little deeper into this problem, and the cause appears to be this code: https://github.com/apache/aurora/blob/master/src/main/python/apache/thermos/core/process.py#L394-L399

We try to set PATH in the environment of the forked process, but PATH is no longer set in
our own environment, so we end up (silently) raising a KeyError when we try to access it.
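
To make the failure mode concrete, here is a minimal Python sketch. It is not the code at the lines linked above; {{build_child_env}} and {{path_is_missing}} are hypothetical names, and the only assumption is that the child's PATH is copied from the parent by direct dictionary indexing.

```python
def build_child_env(parent_env):
    """Hypothetical helper: copy PATH from the parent into a child environment."""
    child_env = {}
    # Direct indexing raises KeyError when PATH is absent from the parent.
    child_env['PATH'] = parent_env['PATH']
    return child_env


def path_is_missing(parent_env):
    """Return True if building the child environment fails on a missing PATH."""
    try:
        build_child_env(parent_env)
    except KeyError:
        # If the caller swallows this without logging, the failure is silent.
        return True
    return False
```

With an empty parent environment (the situation once the agent stops forwarding its variables), the lookup fails; unless the caller logs the exception, nothing in the output points at PATH.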

This raises two questions for me:

# Is PATH one of the environment variables that will *still* be passed to the executor after
this change? (See Jie's response here, indicating that some vars will still be passed: http://mail-archives.apache.org/mod_mbox/mesos-dev/201603.mbox/%3CCAJvN1BOq4aKNGZ5WEQLKp+kgCMaTwWQp8tdn37=16E-_+a+-jA@mail.gmail.com%3E)
# If not (i.e. $PATH will not be set in the executor's environment), do
tasks today expect Aurora to set PATH in their environment? Presumably they do, or at the
very least we cannot assume they do not. Given this, what value should we set PATH to?
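
One possible answer to the second question, sketched under assumptions: keep an inherited PATH when there is one, and otherwise fall back to a default. {{child_env_with_path}} is a hypothetical helper, not an existing Thermos function, and using Python's {{os.defpath}} as the fallback is only an illustration; the right value for Aurora tasks is still the open question.

```python
import os


def child_env_with_path(parent_env, fallback_path=os.defpath):
    """Hypothetical helper: copy the parent env, guaranteeing a PATH entry."""
    env = dict(parent_env)
    # setdefault keeps an inherited PATH and only fills in a missing one.
    env.setdefault('PATH', fallback_path)
    return env
```

The resulting dictionary could then be handed to whatever forks the process, so the lookup can no longer raise a KeyError regardless of what the agent passes along.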

> Investigate executor fixes when Mesos 0.30.0 stops passing along environment variables
> --------------------------------------------------------------------------------------
>
>                 Key: AURORA-1632
>                 URL: https://issues.apache.org/jira/browse/AURORA-1632
>             Project: Aurora
>          Issue Type: Task
>          Components: Executor
>            Reporter: Joshua Cohen
>            Priority: Blocker
>         Attachments: screenshot-1.png
>
>
> In the 0.30.0 release, the Mesos Agent will no longer implicitly pass along its environment variables (see: http://mail-archives.apache.org/mod_mbox/mesos-dev/201603.mbox/%3CCAK7AWaGB24ALh8eb%2BvKMFgc4%2BjmhxZ6ry79HBcKN%2BBt04Sx43A%40mail.gmail.com%3E).
> I tested in Vagrant by explicitly setting the {{--executor_environment_variables}} flag on the agent to {{'{}'}} and verified that this does impact us. Initially we get a "Permission denied" error when trying to fork the runner:
> {noformat}
> I0310 16:36:21.048671 18103 thermos_task_runner.py:275] Forking off runner with cmdline:
 /var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/thermos_runner.pex
--setuid=vagrant --task_id=vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf
--log_to_disk=DEBUG --hostname=192.168.33.7 --thermos_json=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/task.json
--sandbox=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/sandbox
--log_dir=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c
--checkpoint_root=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/checkpoints
--process_logger_destination=file --port=aurora:31248 --port=http:31248
> F0310 16:36:21.057298 18103 aurora_executor.py:80] Task initialization failed: [Errno 13] Permission denied
> {noformat}
> This error can be addressed with the patch from this pull request: https://github.com/apache/aurora/pull/21.
> However, even after applying this patch, processes still fail to fork (see attached screenshot).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
