mesos-dev mailing list archives

From Vinod Kone <vi...@twitter.com>
Subject Re: Mesos master keeps adding and removing slave or just segfaults
Date Wed, 17 Apr 2013 01:41:03 GMT
Hi John,

You seem to have hit a couple of known issues:
https://issues.apache.org/jira/browse/MESOS-300
https://issues.apache.org/jira/browse/MESOS-435

Unfortunately, we haven't been able to reproduce these bugs consistently on
our end, so we have never been able to find the root cause and fix them :/
Please add your data to the above tickets, so that we can diagnose/fix these.
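
When gathering data for the tickets, it may help to note which addresses appear in the logs. In the quoted master log below, the slave registers as slave(1)@127.0.1.1:57599, and any 127.x.x.x address is loopback. Whether that is related to these bugs is not established here. The following is a minimal Python sketch, not part of Mesos, that scans log text for name@ip:port entries and flags the ones bound to loopback. The helper name `loopback_pids` and the regular expression are assumptions made for illustration.

```python
import ipaddress
import re

# Matches entries such as "slave(1)@127.0.1.1:57599" or "master@33.33.13.38:5050".
PID_RE = re.compile(r"(\w+(?:\(\d+\))?)@(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def loopback_pids(log_text):
    """Return (name, ip, port) tuples for entries whose IP is a loopback address."""
    hits = []
    for name, ip, port in PID_RE.findall(log_text):
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            # Skip anything that looks like an IP but is not valid (e.g. 999.1.1.1).
            continue
        if addr.is_loopback:
            hits.append((name, ip, int(port)))
    return hits


# Two lines taken from the logs in this thread, unwrapped onto single lines.
sample = (
    "I0416 10:09:02.099568  2038 master.cpp:906] Attempting to register slave on "
    "vagrant-ubuntu.vagrantup.com at slave(1)@127.0.1.1:57599\n"
    "I0416 10:19:46.213693  1883 slave.cpp:352] New master detected at "
    "master@33.33.13.38:5050\n"
)

if __name__ == "__main__":
    for name, ip, port in loopback_pids(sample):
        print(f"{name} is bound to loopback address {ip}:{port}")
```

The sketch expects each log entry on a single line, so output that was hard-wrapped by a mail client (as in the quoted message) would need to be unwrapped first.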




@vinodkone


On Tue, Apr 16, 2013 at 6:21 AM, John B. Wyatt IV <jbwyatt4@gmail.com> wrote:

> Greetings,
>
> I've been spending some time trying to get Mesos up and running on
> Vagrant (a nice frontend for headless VirtualBox). I have the master set up
> locally on 33.33.13.38:5050 and one slave set up on 33.33.13.39:5050. They're
> able to communicate with each other and the web display on the master
> works. The problem is that the master keeps adding and removing the slave
> or just segfaults sometimes. The web interface doesn't register the slave
> (maybe removed too quickly?). I'm not too sure what to do at this point and
> I was hoping for some help. I'm using Mesos 0.10.
>
> Here is the output from the master:
>
> I0416 10:09:01.794397  2040 dominant_share_allocator.cpp:417] Performed
> allocation for 0 slaves in 0.018916 milliseconds
> I0416 10:09:02.099568  2038 master.cpp:906] Attempting to register slave on
> vagrant-ubuntu.vagrantup.com at slave(1)@127.0.1.1:57599
> I0416 10:09:02.100764  2038 master.cpp:1142] Master now considering a slave
> at vagrant-ubuntu.vagrantup.com:57599 as active
> I0416 10:09:02.101080  2038 master.cpp:1721] Adding slave
> 201304161008-16842879-5050-2023-56 at vagrant-ubuntu.vagrantup.com with
> cpus=2; mem=979; ports=[31000-32000]
> I0416 10:09:02.104706  2038 master.cpp:513] Slave
> 201304161008-16842879-5050-2023-56(vagrant-ubuntu.vagrantup.com)
> disconnected
> I0416 10:09:02.105237  2037 dominant_share_allocator.cpp:244] Added slave
> 201304161008-16842879-5050-2023-56 (vagrant-ubuntu.vagrantup.com) with
> cpus=2; mem=979; ports=[31000-32000] (and cpus=2; mem=979;
> ports=[31000-32000] available)
> I0416 10:09:02.105865  2037 dominant_share_allocator.cpp:435] Performed
> allocation for slave 201304161008-16842879-5050-2023-56 in 0.011817
> milliseconds
> I0416 10:09:02.106258  2037 dominant_share_allocator.cpp:269] Removed slave
> 201304161008-16842879-5050-2023-56
> I0416 10:09:02.797294  2038 dominant_share_allocator.cpp:417] Performed
> allocation for 0 slaves in 0.017615 milliseconds
> I0416 10:09:03.101245  2040 master.cpp:906] Attempting to register slave on
> vagrant-ubuntu.vagrantup.com at slave(1)@127.0.1.1:57599
> I0416 10:09:03.102088  2040 master.cpp:1142] Master now considering a slave
> at vagrant-ubuntu.vagrantup.com:57599 as active
> I0416 10:09:03.103230  2040 master.cpp:1721] Adding slave
> 201304161008-16842879-5050-2023-57 at vagrant-ubuntu.vagrantup.com with
> cpus=2; mem=979; ports=[31000-32000]
> I0416 10:09:03.106045  2040 master.cpp:513] Slave
> 201304161008-16842879-5050-2023-57(vagrant-ubuntu.vagrantup.com)
> disconnected
> I0416 10:09:03.106202  2039 dominant_share_allocator.cpp:244] Added slave
> 201304161008-16842879-5050-2023-57 (vagrant-ubuntu.vagrantup.com) with
> cpus=2; mem=979; ports=[31000-32000] (and cpus=2; mem=979;
> ports=[31000-32000] available)
> I0416 10:09:03.107240  2039 dominant_share_allocator.cpp:435] Performed
> allocation for slave 201304161008-16842879-5050-2023-57 in 0.011276
> milliseconds
> I0416 10:09:03.107650  2039 dominant_share_allocator.cpp:269] Removed slave
> 201304161008-16842879-5050-2023-57
> I0416 10:09:03.799612  2040 dominant_share_allocator.cpp:417] Performed
> allocation for 0 slaves in 0.024916 milliseconds
>
> Here is the output from the slave:
> I0416 10:19:46.207093  1867 main.cpp:123] Creating "process" isolation
> module
> I0416 10:19:46.209199  1867 main.cpp:131] Build: 2013-04-16 07:41:31 by
> vagrant
> I0416 10:19:46.209410  1867 main.cpp:132] Starting Mesos slave
> I0416 10:19:46.210247  1883 slave.cpp:175] Slave started on 1)@
> 127.0.1.1:56701
> I0416 10:19:46.210842  1883 slave.cpp:176] Slave resources: cpus=2;
> mem=979; ports=[31000-32000]
> I0416 10:19:46.213693  1883 slave.cpp:352] New master detected at
> master@33.33.13.38:5050
> Loading webui script at
> '/home/vagrant/mesos-0.10.0/src/webui/slave/webui.py'
> Bottle server starting up (using WSGIRefServer())...
> Listening on http://0.0.0.0:8081/
> Use Ctrl-C to quit.
>
> Sometimes the master just quits:
>
> master:
> I0416 10:19:58.244128  2545 master.cpp:513] Slave
> 201304161019-16842879-5050-2531-12(vagrant-ubuntu.vagrantup.com)
> disconnected
> I0416 10:19:58.245954  2545 dominant_share_allocator.cpp:269] Removed slave
> 201304161019-16842879-5050-2531-12
> F0416 10:19:58.719403  2549 process.cpp:1828] Check failed:
> outgoing.count(s) > 0
> *** Check failure stack trace: ***
>     @     0x7f554933c0ad  google::LogMessage::Fail()
>     @     0x7f554933e83f  google::LogMessage::SendToLog()
>     @     0x7f554933bcab  google::LogMessage::Flush()
>     @     0x7f554933f0cd  google::LogMessageFatal::~LogMessageFatal()
>     @     0x7f5549227484  process::SocketManager::next()
>     @     0x7f55492216bf  process::send_data()
>     @     0x7f554937b9df  ev_invoke_pending
>     @     0x7f554937fd14  ev_loop
>     @     0x7f554922292c  process::serve()
>     @     0x7f5548a9ae9a  start_thread
>     @     0x7f5547fb5cbd  (unknown)
>
>
> Additional output from the slave:
> I0416 10:19:58.808632  1884 slave.cpp:1141] Process exited: @0.0.0.0:0
> W0416 10:19:58.808785  1884 slave.cpp:1144] WARNING! Master disconnected!
> Waiting for a new master to be elected.
>
>
> --
> John
>
