mesos-dev mailing list archives

From "brian wickman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-857) restructure mesos python namespace
Date Wed, 30 Apr 2014 17:44:18 GMT

    [ https://issues.apache.org/jira/browse/MESOS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985801#comment-13985801 ]

brian wickman commented on MESOS-857:
-------------------------------------

I propose that we restructure the mesos python project.  Right now it's fractured haphazardly,
yet there are idioms made available by the python packaging ecosystem to do this correctly.

For example, src/cli is a mishmash of C++ and python, and it contains a redeclaration of
'mesos' in unpackaged form that would conflict with the existing code in src/python.
Meanwhile src/python bundles mesos_pb2.py, mesos.py and _mesos.so directly into the top-level
namespace.  Ordinarily if you 'pip install baz', you expect one top-level package name with
everything residing underneath, e.g. 'import baz' with baz.foo, baz.bar, baz.bak subpackages.

We should structure the mesos namespace such that bits and pieces of mesos can be installed
a la carte.  Right now you have to go all-in, bringing in C extensions (which are challenging
to build and have no pure source distribution available yet), and that is a hindrance to adoption.

It seems reasonable that I might just want API stubs or the code-generated protobuf classes
or just the CLI.  We can do this in a few ways, but it means splitting everything into different
packages with dependencies between each (codified by "install_requires" in setup.py.)  The
following proposal uses a top-level 'mesos' namespace package, but it could be done with separate
top-level packages, e.g. mesos_api, mesos_driver, instead of mesos.api or mesos.driver.

I propose the following packages (which would also mirror the import namespace):

{noformat}
  mesos [nspkg]
  mesos.api [pkg]
  mesos.cli [pkg]
  mesos.driver [pkg]
  mesos.native [pkg]
  mesos.protocol [pkg]
{noformat}

mesos should be a namespace package: it contains no symbols.  But by default it would have
install_requires on everything provided within the mesos project, so that 'pip install mesos'
does approximately the correct thing.  But in and of itself, it would contain no sources.
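
As a rough sketch of what the source-free 'mesos' meta package could declare, assuming the
five subpackages listed above are published as separate distributions under the same names
(the helper name and the version string used in the usage note are made up for illustration):

```python
def meta_package_kwargs(version):
    """Keyword arguments a source-free 'mesos' meta package might pass to setuptools.setup().

    Assumption: every component is released in lockstep with the same version number.
    """
    components = ["api", "cli", "driver", "native", "protocol"]
    return {
        "name": "mesos",
        "version": version,
        "packages": [],  # the meta package ships no sources of its own
        "install_requires": ["mesos.%s==%s" % (c, version) for c in components],
    }
```

A setup.py would then call something like setup(**meta_package_kwargs("0.19.0")), so that
'pip install mesos' pulls in every component while each one stays installable on its own.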

mesos.api should contain just the Scheduler, SchedulerDriver, Executor, ExecutorDriver (and
in the future, possibly Log, LogDriver, Containerizer, ContainerizerDriver) stubs.  It has
no dependencies on anything else.

mesos.cli should contain all the CLI commands.  It also shouldn't need to depend on any other
packages except maybe mesos.protocol.  We can use the console_scripts entry point in mesos.cli
to handle script installation (see
http://www.scotttorborg.com/python-packaging/command-line-scripts.html#the-console-scripts-entry-point).
This means 'pip install mesos.cli' would create wrapper scripts for mesos-cat, mesos-ps,
etc. that invoke the underlying python modules with all the dependencies set up correctly,
and that are placed on the $PATH in the same directory as your python interpreter.
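
To make the console_scripts idea concrete, here is a minimal sketch of the entry_points
mapping that mesos.cli's setup.py could hand to setuptools.  The module layout
(mesos.cli.<command> exposing a main() function) is an assumption for illustration, not
something that exists today:

```python
def cli_entry_points(commands):
    """Build a console_scripts mapping for the given CLI command names.

    Assumption: each command lives in a hypothetical module mesos.cli.<name>
    that exposes a main() function.
    """
    return {
        "console_scripts": [
            "mesos-%s = mesos.cli.%s:main" % (name, name) for name in commands
        ]
    }
```

Passing entry_points=cli_entry_points(["cat", "ps"]) to setup() would ask setuptools to
generate mesos-cat and mesos-ps wrapper scripts at install time.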

mesos.driver should be a package that is a small wrapper around setuptools find_packages
+ pkg_resources get_entry_map, used to detect any python packages in the environment exporting
concrete driver implementations (e.g. _mesos.MesosSchedulerDriver or _mesos.MesosExecutorDriver).
This would be done via EntryPoints (see
https://pythonhosted.org/setuptools/pkg_resources.html#entry-points).
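
A minimal sketch of the discovery side follows.  Note that it swaps in the standard
library's importlib.metadata (Python 3.8+) rather than pkg_resources, since it reads the
same entry point metadata without requiring setuptools at runtime.  The group name
"mesos.driver" and both function names are assumptions for illustration:

```python
from importlib import metadata

DRIVER_GROUP = "mesos.driver"  # hypothetical entry point group name


def iter_group(group):
    """Return the installed entry points registered under `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):  # Python 3.10+ selection API
        return list(eps.select(group=group))
    return list(eps.get(group, []))  # Python 3.8/3.9 return a plain dict


def discover_drivers(group=DRIVER_GROUP):
    """Load every driver implementation advertised in the environment.

    Returns a list of (entry point name, loaded object) pairs; providers
    that fail to import are skipped rather than aborting discovery.
    """
    drivers = []
    for ep in iter_group(group):
        try:
            drivers.append((ep.name, ep.load()))
        except Exception:
            continue
    return drivers
```

With no provider installed, discover_drivers() simply returns an empty list, which is the
case a pure mesos.api user would see.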

mesos.native should be the package that contains _mesos.so, with the entry_point metadata
expected by mesos.driver declared in its setup.py.  We could even go so far as to publish
mesos.native.el5 or mesos.native.el6 binary wheels to PyPI in order to differentiate linux
ABIs, but have them correctly detected and picked up by mesos.driver at runtime.  This strategy
is also compatible with the pesos project (https://github.com/wickman/pesos ), which would just
publish PesosSchedulerDriver and PesosExecutorDriver entry points for mesos.driver, allowing a
pure python scheduler or executor to be implemented.
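
On the provider side, the metadata could look roughly like the sketch below.  The
"mesos.driver" group name, the "scheduler"/"executor" entry point names, and the 'pesos'
module path are all assumptions; only the class names come from the text above:

```python
def driver_entry_points(module, prefix):
    """Entry point metadata a driver provider might declare in its setup.py.

    `module` is the importable module holding the classes, and `prefix` is
    the class name prefix (e.g. "Mesos" gives MesosSchedulerDriver).
    """
    return {
        "mesos.driver": [  # hypothetical group name that mesos.driver would scan
            "scheduler = %s:%sSchedulerDriver" % (module, prefix),
            "executor = %s:%sExecutorDriver" % (module, prefix),
        ]
    }


# mesos.native would advertise the C extension's classes...
native = driver_entry_points("_mesos", "Mesos")
# ...while a pure python provider such as pesos could advertise its own.
pure_python = driver_entry_points("pesos", "Pesos")  # module path is a guess
```

Either mapping would be passed as entry_points=... to setup(), and mesos.driver would not
need to know in advance which provider is installed.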

Finally, mesos.protocol would be the package containing all of the code-generated protobuf
stubs.  We could even split mesos.protocol out as a namespace package with separate subpackages
for mesos.protocol.pb and mesos.protocol.json.  Currently protobuf only supports python 2.x
(there are some branches out there with support for 3.x, but afaik there is no plan for those
to reach master).  mesos.protocol.pb would have an install_requires on protobuf, while
mesos.protocol.json would be dependency-free and hence friendly to python 3.x.  Ideally there
would be helper messages for constructing the body of libprocess messages (the "wire protocol").
In the future that could be ported over to the Event/Call interface that Ben has described.

In order to support legacy applications, we could have a mesos.legacy package, which would
map all the above names onto their _mesos, mesos_pb2 and mesos.* counterparts.
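
One way such a shim could work is by registering the new modules under their old names in
sys.modules.  This is only a sketch of the aliasing technique: alias_module is a made-up
helper, and the stand-in module and its EXAMPLE_CONSTANT attribute are placeholders for the
real generated protobuf code:

```python
import sys
import types


def alias_module(legacy_name, target):
    """Expose the module object `target` under `legacy_name` for old-style imports."""
    sys.modules[legacy_name] = target
    return target


# Stand-in for the proposed mesos.protocol.pb package (placeholder contents).
new_pb = types.ModuleType("mesos.protocol.pb")
new_pb.EXAMPLE_CONSTANT = 1

# After this call, legacy code doing 'import mesos_pb2' gets the new module.
alias_module("mesos_pb2", new_pb)
```

Because import consults sys.modules first, existing 'import mesos_pb2' statements would keep
working without the old top-level file being installed.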

> restructure mesos python namespace
> ----------------------------------
>
>                 Key: MESOS-857
>                 URL: https://issues.apache.org/jira/browse/MESOS-857
>             Project: Mesos
>          Issue Type: Improvement
>          Components: python api
>            Reporter: brian wickman
>
> Right now the mesos_pb2 and mesos dependencies are bundled together into the mesos egg.
> We have some tooling that uses just the compiled protobufs, but because they're lumped together
> with the mesos egg, we get all the dependency/platform nightmare that comes along with it,
> not to mention the bloat of including 20MB of .so files.  This proposes splitting the mesos
> protobufs into a separate mesos_pb distribution that the mesos distribution should depend
> upon via install_requires (e.g. "mesos_pb==0.15.0-rc4").



--
This message was sent by Atlassian JIRA
(v6.2#6252)
