mesos-user mailing list archives

From Niklas Nielsen <nik...@mesosphere.io>
Subject Re: Problem with start MPD when using MPI with Mesos
Date Thu, 23 Apr 2015 22:23:28 GMT
We found that newer versions of MPICH2 don't use MPD as a launcher. Try
looking in the executor sandbox for the error message.

I would try mesos-hydra or github.com/nqn/gasc instead :)

Niklas
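
To check which launcher a given MPICH install actually provides, a small sketch along these lines may help. The binary names `mpd`, `mpdboot` and `mpiexec.hydra` are assumptions about what a typical MPICH install puts on the PATH, and `find_launchers` is a hypothetical helper, not part of Mesos or MPICH.

```python
import shutil


def find_launchers(names=("mpd", "mpdboot", "mpiexec.hydra"), which=shutil.which):
    """Map each candidate launcher name to its path on PATH, or None if absent.

    The default names are assumptions; adjust them to match the local install.
    """
    return {name: which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_launchers().items():
        print("%-14s %s" % (name, path or "not found"))
```

If `mpd` comes back as not found on the slave, the executor has nothing to start, which would be consistent with the failure reported below.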

On 23 April 2015 at 11:23, Chong Chen <Chong.Chen2@huawei.com> wrote:

>  Hi,
>
>
>
> I installed MPICH2-1.2 and Mesos 0.20.1. When I try to run an example MPI
> job with Mesos, MPD is unable to start properly. The output of my test
> is below.
>
> Could you please help me to fix this problem? Thanks!
>
>
>
> Best Regards,
>
> Chong
>
>
>
> bash ./mpiexec-mesos 127.0.0.1:5050 ./hello
>
> Connecting to Mesos master 127.0.0.1:5050
>
> MPD_PID is Carmel-5_32927
>
> I0423 11:15:55.252918 70968 sched.cpp:139] Version: 0.20.1
>
> I0423 11:15:55.254184 71016 sched.cpp:235] New master detected at
> master@127.0.0.1:5050
>
> I0423 11:15:55.254294 71016 sched.cpp:243] No credentials provided.
> Attempting to register without authentication
>
> I0423 11:15:55.254957 71039 sched.cpp:409] Framework registered with
> 20150423-110655-16777343-5050-68155-0003
>
> Mesos MPI scheduler and mpd running at Carmel-5:32927
>
> Registered with framework ID 20150423-110655-16777343-5050-68155-0003
>
> Got 1 resource offers
>
> Considering resource offer 20150423-110655-16777343-5050-68155-87 from
> Carmel-5
>
> Accepting offer on Carmel-5 to start mpd 0
>
> Replying to offer: launching mpd 0 on host Carmel-5
>
> We've launched all our MPDs; waiting for them to come up
>
> ...waiting on MPD(s)...
>
> Task 0 in state 1
>
> ...waiting on MPD(s)...
>
> Task 0 in state 3
>
> A task finished unexpectedly, calling mpdexit on Carmel-5_32927
>
> Got 1 resource offers
>
> Considering resource offer 20150423-110655-16777343-5050-68155-88 from
> Carmel-5
>
> Declining permanently because we have already launched enough tasks
>
> I0423 11:15:56.777765 71038 sched.cpp:747] Stopping framework
> '20150423-110655-16777343-5050-68155-0003'
>
>
>
>
>
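
The "Task 0 in state N" lines in the output above print the raw numeric task state. A minimal sketch for decoding them follows, assuming the numbering of the `TaskState` enum in the Mesos protobuf definitions of that era; `decode_task_states` is a hypothetical helper, and the mapping should be checked against the `mesos.proto` of the installed version.

```python
import re

# Assumed numeric values of the Mesos TaskState enum (0.20-era mesos.proto).
TASK_STATES = {
    0: "TASK_STARTING",
    1: "TASK_RUNNING",
    2: "TASK_FINISHED",
    3: "TASK_FAILED",
    4: "TASK_KILLED",
    5: "TASK_LOST",
    6: "TASK_STAGING",
}

# Matches the scheduler's status lines, e.g. "Task 0 in state 3".
STATE_LINE = re.compile(r"Task (\d+) in state (\d+)")


def decode_task_states(log_text):
    """Return (task_id, state_name) pairs for every status line in log_text."""
    decoded = []
    for match in STATE_LINE.finditer(log_text):
        task_id = int(match.group(1))
        state = int(match.group(2))
        decoded.append((task_id, TASK_STATES.get(state, "UNKNOWN(%d)" % state)))
    return decoded


if __name__ == "__main__":
    sample = "Task 0 in state 1\n...waiting on MPD(s)...\nTask 0 in state 3\n"
    for task_id, name in decode_task_states(sample):
        print("task %d: %s" % (task_id, name))
```

Under that assumed mapping, the log shows task 0 reaching TASK_RUNNING and then TASK_FAILED, so the reason for the failure should be in the task's stderr in the executor sandbox.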
