mesos-dev mailing list archives

From "Harvey Feng" <h.f...@berkeley.edu>
Subject Re: Review Request: Updates and additions to the MPI framework
Date Fri, 01 Jun 2012 10:21:09 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4768/
-----------------------------------------------------------

(Updated 2012-06-01 10:21:09.375357)


Review request for mesos, Benjamin Hindman, Charles Reiss, and Jessica.


Changes
-------

L59 should fix the start mpiexec problem


Summary
-------

Some updates to point out:

- nmpiexec.py
  -> 'mpdallexit' should terminate every slave's mpd in the ring. I moved 'driver.stop()'
into statusUpdate() so that the driver stops once all tasks have finished, which happens when
all of the mpd processes launched by the executors have exited.
- startmpd.py
  -> I kept cleanup() and added code in shutdown() that manually kills mpd processes.
I think both might be useful: cleanup() during abnormal framework/executor termination and
shutdown() during normal termination. cleanup() still terminates all mpds on the slave, but
shutdown() doesn't.
  -> killtask() stops the mpd associated with the given task ID (tid).
  -> Task states now update correctly. They correspond to the state of each task's associated
mpd process.
- Readme
  -> Added info on how to set up and run MPICH2 1.2 and nmpiexec on OS X and
Ubuntu/Linux.
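
To illustrate the nmpiexec.py change, here is a rough sketch of stopping the driver from
statusUpdate() once every task is done. It is not the code in the diff: FakeDriver,
MpdScheduler, the plain-string task states, and the simplified
statusUpdate(driver, task_id, state) signature are all assumptions made so the example
runs on its own without the Mesos bindings.

```python
# Sketch only: FakeDriver and the string task states are stand-ins,
# not the real Mesos Python bindings or the code under review.
TERMINAL_STATES = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST"}


class FakeDriver:
    """Hypothetical stand-in for the scheduler driver; it only records stop()."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


class MpdScheduler:
    """Stops the driver once every launched mpd task reaches a terminal state."""

    def __init__(self, task_ids):
        # Tasks whose mpd process has not exited yet.
        self.pending = set(task_ids)

    def statusUpdate(self, driver, task_id, state):
        # Simplified signature (assumption): task ID and state passed directly.
        if state in TERMINAL_STATES:
            self.pending.discard(task_id)
        if not self.pending:
            driver.stop()


driver = FakeDriver()
scheduler = MpdScheduler(["mpd-0", "mpd-1"])
scheduler.statusUpdate(driver, "mpd-0", "TASK_RUNNING")
scheduler.statusUpdate(driver, "mpd-0", "TASK_FINISHED")
print(driver.stopped)  # one task is still pending
scheduler.statusUpdate(driver, "mpd-1", "TASK_FINISHED")
print(driver.stopped)  # all tasks are terminal, so stop() was called
```

Keeping a set of pending task IDs means the driver stops exactly when the last mpd exits,
regardless of the order in which the updates arrive.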


This addresses bug MESOS-183.
    https://issues.apache.org/jira/browse/MESOS-183


Diffs (updated)
-----

  frameworks/mpi/README.txt cdb4553 
  frameworks/mpi/mpiexec-mesos PRE-CREATION 
  frameworks/mpi/mpiexec-mesos.py PRE-CREATION 
  frameworks/mpi/nmpiexec 517bdbc 
  frameworks/mpi/nmpiexec.py a5db9c0 
  frameworks/mpi/startmpd.py 8eeba5e 
  frameworks/mpi/startmpd.sh 44faa05 

Diff: https://reviews.apache.org/r/4768/diff


Testing
-------


Thanks,

Harvey

