mesos-dev mailing list archives

From " (JIRA)" <>
Subject [jira] [Commented] (MESOS-183) Included MPI Framework Fails to Start
Date Tue, 08 May 2012 12:21:51 GMT

] commented on MESOS-183:

bq.  On 2012-05-08 03:39:29, Benjamin Hindman wrote:
bq.  > I'll get this checked in provided Jessica gives it a "Ship It". Thanks for the good
work here, I intend to make it a demonstration of how to write frameworks on Mesos!

Scratch that. I voted to ship it and then remembered an issue that I don't think has been
addressed yet. I posted this on the jira, but I haven't seen any changes for it: 

I'm running into the setuptools issue addressed in the test python framework:
The locations of the eggs added to PYTHONPATH in nmpiexec [now mpiexec-mesos?] need to be
updated so that the Mesos/protobuf libraries (and setuptools) don't have to be installed on
every node. 
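As a rough illustration of the PYTHONPATH point above, here is a minimal sketch of a launcher that picks up eggs from directories shipped alongside it, so that nothing has to be installed on each node. The helper name add_eggs_to_path and the idea of passing in a list of egg directories are assumptions made for this example; this is not the code in nmpiexec.

```python
import glob
import os
import sys


def add_eggs_to_path(egg_dirs, path=None):
    """Prepend every *.egg found in egg_dirs to path (sys.path by default).

    Hypothetical helper: the directories are whatever the launcher ships,
    for example the build tree's egg output directories.
    """
    if path is None:
        path = sys.path
    found = []
    for directory in egg_dirs:
        # sorted() keeps the result deterministic across nodes
        for egg in sorted(glob.glob(os.path.join(directory, "*.egg"))):
            if egg not in path and egg not in found:
                found.append(egg)
    # Put the bundled eggs ahead of anything installed system-wide
    path[:0] = found
    return found
```

Calling this before "import mesos" would let the bundled eggs take precedence over any system-wide copies, assuming the eggs themselves are importable from where they sit.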

There also seems to be an issue with Python detecting the Mesos module from the egg in src/python/dist. I
couldn't import mesos until I unzipped the egg, no matter what directory I was in or how I
modified the PYTHONPATH. [Update: I believe it's related to the fact that the mesos egg uses
C/C++ extensions. I think it needs to use a setuptools module to list the package contents.]
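Python's zip importer does not load compiled extension modules (.so, .pyd) directly from a zip archive, which would be consistent with the unzip workaround described above. A minimal sketch for checking an egg and unpacking it follows; the function names native_members and unzip_egg are hypothetical, and this is a diagnostic aid rather than the fix that was eventually applied.

```python
import zipfile

# File suffixes that indicate compiled extension modules
NATIVE_SUFFIXES = (".so", ".pyd", ".dylib")


def native_members(egg_path):
    """List compiled files inside a zipped egg.

    Returns [] when egg_path is not a zip file (for example, an egg that
    has already been unpacked into a directory) or has no compiled files.
    """
    if not zipfile.is_zipfile(egg_path):
        return []
    with zipfile.ZipFile(egg_path) as egg:
        return [name for name in egg.namelist()
                if name.endswith(NATIVE_SUFFIXES)]


def unzip_egg(egg_path, target_dir):
    """Unpack a zipped egg so its extension modules can be imported."""
    with zipfile.ZipFile(egg_path) as egg:
        egg.extractall(target_dir)
    return target_dir
```

If native_members() returns anything for the mesos egg, putting the unpacked directory on PYTHONPATH instead of the zipped egg is one way to sidestep the import failure.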

- Jessica

This is an automatically generated e-mail. To reply, visit:

On 2012-05-08 01:29:06, Harvey Feng wrote:
bq.  (Updated 2012-05-08 01:29:06)
bq.  Review request for mesos, Benjamin Hindman, Charles Reiss, and Jessica.
bq.  Summary
bq.  -------
bq.  Some updates to point out:
bq.    -> 'mpdallexit' should terminate all slaves' mpds in the ring. I moved 'driver.stop()'
to statusUpdate() so that it stops when all tasks have been finished, which occurs when the
executor's launched mpd processes have all exited. 
bq.    -> Didn't remove cleanup(), and added code in shutdown() that manually kills mpd
processes. They might be useful during abnormal (cleanup) and normal (shutdown) framework/executor
termination...I think. cleanup() still terminates all mpd's in the slave, but shutdown doesn't.

bq.    -> killtask() stops the mpd associated with the given tid. 
bq.    -> Task states update nicely now. They correspond to the state of a task's associated
mpd process.
bq.  -Readme
bq.    -> Included additional info on how to set up and run MPICH2 1.2 and nmpiexec on OS
X and Ubuntu/Linux
bq.  This addresses bug MESOS-183.
bq.  Diffs
bq.  -----
bq.    frameworks/mpi/README.txt cdb4553 
bq.    frameworks/mpi/nmpiexec 517bdbc 
bq.    frameworks/mpi/ a5db9c0 
bq.    frameworks/mpi/ 8eeba5e 
bq.    frameworks/mpi/ 44faa05 
bq.  Diff:
bq.  Testing
bq.  -------
bq.  Thanks,
bq.  Harvey
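
The first update in the summary above (calling driver.stop() from statusUpdate() once every task has finished) can be sketched roughly as follows. This is a self-contained illustration, not the patch under review: FakeDriver stands in for the real scheduler driver, the class name MpdScheduler is hypothetical, and task states are plain strings rather than the protobuf enum values the framework would use.

```python
class FakeDriver:
    """Stand-in for the scheduler driver; only records whether stop() ran."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


class MpdScheduler:
    """Hypothetical scheduler that stops the driver after the last task."""

    def __init__(self, total_tasks):
        self.total_tasks = total_tasks
        self.finished = set()

    def statusUpdate(self, driver, task_id, state):
        # A task reports TASK_FINISHED once its mpd process has exited
        if state == "TASK_FINISHED":
            self.finished.add(task_id)
        # Stop only when every launched task has reported completion
        if len(self.finished) == self.total_tasks:
            driver.stop()
```

Tracking finished task IDs in a set means a duplicate status update for the same task does not stop the driver early.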

> Included MPI Framework Fails to Start
> -------------------------------------
>                 Key: MESOS-183
>                 URL:
>             Project: Mesos
>          Issue Type: Bug
>          Components: documentation, framework
>         Environment: Scientific Linux Cluster
>            Reporter: Jessica J
>            Assignee: Harvey Feng 
>            Priority: Blocker
>              Labels: documentation, mpi, setup
> There are really two facets to this issue. The first is that no good documentation exists
for setting up and using the included MPI framework. The second, and more important, issue
is that the framework will not run. The second issue is possibly related to the first in that
I may not be setting it up properly. 
> To test the MPI framework, by trial and error I determined I needed to run python
build and python install in the MESOS-HOME/src/python directory. Now when I try to
run nmpiexec -h, I get an AttributeError, below: 
> Traceback (most recent call last):
>   File "./", line 2, in <module>
>     import mesos
>   File "/usr/lib64/python2.6/site-packages/mesos-0.9.0-py2.6-linux-x86_64.egg/",
line 22, in <module>
>     import _mesos
>   File "/usr/lib64/python2.6/site-packages/mesos-0.9.0-py2.6-linux-x86_64.egg/",
line 1286, in <module>
>     DESCRIPTOR.message_types_by_name['FrameworkID'] = _FRAMEWORKID
> AttributeError: 'FileDescriptor' object has no attribute 'message_types_by_name'
> I've examined and determined that the version of protobuf it includes (2.4.1)
does, indeed, contain a FileDescriptor class in that sets self.message_types_by_name,
so I'm not sure what the issue is. Is this a bug? Or is there a step I'm missing? Do I need
to also build/install protobuf?
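
One possible explanation for the traceback above, offered as an assumption and not confirmed anywhere in this thread, is that a different copy of protobuf than the bundled one is found first on the import path. A minimal sketch for checking where a module would be loaded from follows. It assumes Python 3.4 or later (the report itself is on Python 2.6), and the helper name module_origin is hypothetical.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be imported from, or None if not found.

    find_spec() locates the named module without importing it (for a dotted
    name such as "google.protobuf" the parent package is imported first).
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Parent package missing, or a module with a broken __spec__
        return None
    if spec is None:
        return None
    return spec.origin
```

For example, printing module_origin("google.protobuf") on the failing node would show whether the import resolves inside the expected egg or somewhere else under site-packages.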


