Mailing-List: contact mesos-dev-help@incubator.apache.org; run by ezmlm
Reply-To: mesos-dev@incubator.apache.org
Date: Tue, 8 May 2012 03:41:12 +0000 (UTC)
From: "jiraposter@reviews.apache.org (JIRA)"
To: mesos-dev@incubator.apache.org
Message-ID: <143668859.37333.1336448472514.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1153712837.17694.1334247080891.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MESOS-183) Included MPI Framework Fails to Start

[
https://issues.apache.org/jira/browse/MESOS-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270162#comment-13270162 ]

jiraposter@reviews.apache.org commented on MESOS-183:
-----------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4768/#review7666
-----------------------------------------------------------

Ship it!

I'll get this checked in provided Jessica gives it a "Ship It". Thanks for the good work here; I intend to make it a demonstration of how to write frameworks on Mesos!

- Benjamin


On 2012-05-08 01:29:06, Harvey Feng wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/4768/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2012-05-08 01:29:06)
bq.
bq.
bq. Review request for mesos, Benjamin Hindman, Charles Reiss, and Jessica.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. Some updates to point out:
bq.
bq. - nmpiexec.py
bq.   -> 'mpdallexit' should terminate all slaves' mpds in the ring. I moved 'driver.stop()' to statusUpdate() so that it stops once all tasks have finished, which occurs when the executor's launched mpd processes have all exited.
bq. - startmpd.py
bq.   -> Didn't remove cleanup(), and added code in shutdown() that manually kills mpd processes. They might be useful during abnormal (cleanup) and normal (shutdown) framework/executor termination... I think. cleanup() still terminates all mpds on the slave, but shutdown() doesn't.
bq.   -> killTask() stops the mpd associated with the given tid.
bq.   -> Task states now update correctly; they correspond to the state of a task's associated mpd process.
bq. - README
bq.   -> Included additional info on how to set up and run MPICH2 1.2 and nmpiexec on OS X and Ubuntu/Linux
bq.
bq.
bq. This addresses bug MESOS-183.
bq.     https://issues.apache.org/jira/browse/MESOS-183
bq.
bq.
bq. Diffs
bq. -----
bq.
bq.   frameworks/mpi/README.txt cdb4553
bq.   frameworks/mpi/nmpiexec 517bdbc
bq.   frameworks/mpi/nmpiexec.py a5db9c0
bq.   frameworks/mpi/startmpd.py 8eeba5e
bq.   frameworks/mpi/startmpd.sh 44faa05
bq.
bq. Diff: https://reviews.apache.org/r/4768/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq.
bq. Thanks,
bq.
bq. Harvey
bq.
bq.


> Included MPI Framework Fails to Start
> -------------------------------------
>
>                 Key: MESOS-183
>                 URL: https://issues.apache.org/jira/browse/MESOS-183
>             Project: Mesos
>          Issue Type: Bug
>          Components: documentation, framework
>         Environment: Scientific Linux Cluster
>            Reporter: Jessica J
>            Assignee: Harvey Feng
>            Priority: Blocker
>              Labels: documentation, mpi, setup
>
> There are really two facets to this issue. The first is that no good documentation exists for setting up and using the included MPI framework. The second, and more important, issue is that the framework will not run. The second issue is possibly related to the first in that I may not be setting it up properly.
>
> To test the MPI framework, I determined by trial and error that I needed to run python setup.py build and python setup.py install in the MESOS-HOME/src/python directory.
> Now when I try to run nmpiexec -h, I get the AttributeError below:
>
> Traceback (most recent call last):
>   File "./nmpiexec.py", line 2, in <module>
>     import mesos
>   File "/usr/lib64/python2.6/site-packages/mesos-0.9.0-py2.6-linux-x86_64.egg/mesos.py", line 22, in <module>
>     import _mesos
>   File "/usr/lib64/python2.6/site-packages/mesos-0.9.0-py2.6-linux-x86_64.egg/mesos_pb2.py", line 1286, in <module>
>     DESCRIPTOR.message_types_by_name['FrameworkID'] = _FRAMEWORKID
> AttributeError: 'FileDescriptor' object has no attribute 'message_types_by_name'
>
> I've examined setup.py and determined that the version of protobuf it includes (2.4.1) does indeed contain a FileDescriptor class in descriptor.py that sets self.message_types_by_name, so I'm not sure what the issue is. Is this a bug? Or is there a step I'm missing? Do I also need to build/install protobuf?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
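[Editor's note] The driver.stop() relocation Harvey describes in the review, stopping the scheduler driver only once every launched mpd task has reached a terminal state, can be sketched roughly as below. This is an illustrative stand-in, not the actual patch: the simplified statusUpdate signature, the TERMINAL_STATES set, and the FakeDriver class are all assumptions for demonstration (the real Mesos 0.9 Python API passes a TaskStatus protobuf to statusUpdate(self, driver, update) and uses mesos_pb2 state constants).

```python
# Illustrative task states; the real values live in mesos_pb2.
TASK_FINISHED, TASK_FAILED, TASK_KILLED, TASK_LOST = range(4)
TERMINAL_STATES = {TASK_FINISHED, TASK_FAILED, TASK_KILLED, TASK_LOST}

class MPIScheduler(object):
    """Sketch: stop the driver once every mpd task has terminated."""

    def __init__(self, total_tasks):
        self.total_tasks = total_tasks
        self.terminated = set()

    def statusUpdate(self, driver, task_id, state):
        # Record terminal updates; once every mpd task is done, the
        # MPD ring has exited and it is safe to stop the driver.
        if state in TERMINAL_STATES:
            self.terminated.add(task_id)
        if len(self.terminated) == self.total_tasks:
            driver.stop()

class FakeDriver(object):
    """Hypothetical stand-in for MesosSchedulerDriver, for testing."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True
```

The point of the move is ordering: calling driver.stop() from statusUpdate() guarantees the framework does not exit while mpd processes are still running, which is what 'mpdallexit' relies on.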
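[Editor's note] One way to narrow down the AttributeError in the traceback is to check which protobuf runtime Python actually resolves: that error is typical of a mesos_pb2.py generated by protoc 2.4.x being loaded against an older protobuf runtime (which lacks message_types_by_name on FileDescriptor) found earlier on sys.path. The script below is a hypothetical diagnostic, not part of the Mesos tree, and protobuf_info() is a name invented for this sketch.

```python
def protobuf_info():
    """Return (version, path) of the importable protobuf runtime, or None."""
    try:
        import google.protobuf as pb
    except ImportError:
        return None
    # Older releases may not define __version__, so fall back gracefully.
    return (getattr(pb, "__version__", "unknown"), pb.__file__)

if __name__ == "__main__":
    info = protobuf_info()
    if info is None:
        print("protobuf runtime not importable; check your install")
    else:
        # A runtime older than the 2.4.1 bundled with setup.py would
        # explain the missing message_types_by_name attribute.
        print("protobuf %s loaded from %s" % info)
```

If the reported path points at a system-wide protobuf egg older than 2.4.1, removing it or reinstalling the bundled version would be the next step to try.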