mesos-user mailing list archives

From Stefano Bianchi <jazzist...@gmail.com>
Subject Re: removed slace "ID": (131.154.96.172): health check timed out
Date Tue, 19 Apr 2016 07:22:47 GMT
Actually, I have already applied most of these settings, apart from
/etc/mesos-master/ip: should I write the IP of the master's interface there? And
in /etc/mesos-slave/ip, should I write the IP of the slave's interface?
Your suggestion seems to be the right one, because when I try to ping machines
from one network to the other, some are reachable and others are not, and the
latter can sometimes ping, typically at boot, and after a while no longer
can.
Thanks for your suggestion, I am going to try it.
On 19 Apr 2016 03:12, "Dick Davies" <dick@hellooperator.net> wrote:

> On our network a lot of the hosts have multiple interfaces, which let
> some asymmetric routing
> issues creep in that prevented our masters replying to slaves, which
> reminded me of your symptoms.
>
> So we set an IP address in /etc/mesos-slave/ip and
> /etc/mesos-master/ip so that they only listen
> on one interface, and then check connectivity between those IPs.
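
On a host with several interfaces, one way to see which local address the
kernel would actually use to reach a given peer (and so which IP is a
candidate for /etc/mesos-slave/ip or /etc/mesos-master/ip) is the sketch
below. It assumes Python 3 is available on the host; source_ip_for is a
hypothetical helper name, and 5050 is only the usual master port. Connecting
a UDP socket sends no packets, it only makes the kernel pick a route.

```python
import socket


def source_ip_for(peer_host, peer_port=5050):
    """Return the local IPv4 address the kernel would use to reach peer_host.

    A UDP connect() transmits nothing; it only binds the socket to the
    source address chosen by the routing table.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((peer_host, peer_port))
        return s.getsockname()[0]
```

Calling source_ip_for("<master IP>") on a slave, and source_ip_for("<slave IP>")
on a master, shows whether both directions go through the interfaces you expect.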
>
> The Ansible repo we use to build the stack now has a 'signoff'
> playbook to check network connectivity
> is correct between the services it deploys to a new environment.
>
> It won't be much use to you on its own I'm afraid, but
> here's a checklist cribbed from that playbook (ports might be
> different in your setup).
>
> You can SSH to the servers and check reachability between them with
> netcat or telnet.
>
>
> zookeepers:
>
> - need to be able to reach each other on the election port (usually
> tcp/3888)
>
> masters:
>
> * must be able to reach zookeepers on tcp/2181
> * must be able to reach each other on tcp/5050
> * must be able to reach slaves on tcp/5051
>
> mesos slaves:
>
> - must be able to reach masters on tcp/5050
> - must be able to reach zookeepers on tcp/2181
> - any other connectivity to services your application needs
> (database, caches, whatever)
>
> I think that's it.
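
The checklist above can also be scripted instead of running netcat or telnet
by hand. This is a minimal sketch, not the Ansible playbook mentioned above:
it assumes Python 3 on each host, the CHECKS table simply restates the ports
from the checklist (they may differ in your setup), and can_connect and
check_role are hypothetical helper names.

```python
import socket

# role -> list of (target role, TCP port) that the role must be able to reach.
# Ports are the ones from the checklist; adjust them to your environment.
CHECKS = {
    "zookeeper": [("zookeeper", 3888)],
    "master": [("zookeeper", 2181), ("master", 5050), ("slave", 5051)],
    "slave": [("master", 5050), ("zookeeper", 2181)],
}


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_role(role, hosts, timeout=3.0):
    """Run on a machine with the given role.

    hosts maps a role name to the list of addresses of machines in that role.
    Returns {(host, port): reachable} for every check that applies.
    """
    results = {}
    for target_role, port in CHECKS[role]:
        for host in hosts.get(target_role, []):
            results[(host, port)] = can_connect(host, port, timeout)
    return results
```

For example, on a slave you could call
check_role("slave", {"master": [...], "zookeeper": [...]}) with your own
addresses filled in, then repeat with role "master" on each master. Any False
entry in the master-to-slave direction (tcp/5051) would match the symptom
described in this thread.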
>
> On 18 April 2016 at 20:39, Stefano Bianchi <jazzista88@gmail.com> wrote:
> > Hi Dick Davies
> >
> > Could you please share your solution?
> > How did you set up Mesos/ZooKeeper to interconnect masters and slaves
> > across networks?
> >
> > Thanks a lot!
> >
> > 2016-04-18 20:56 GMT+02:00 Dick Davies <dick@hellooperator.net>:
> >>
> >> +1 for that theory, we had some screwy issues when we tried to span
> >> subnets until we set every slave and master
> >> to listen on a specific IP so we could tie down routing correctly.
> >>
> >> Saw symptoms very similar to those that have been described.
> >>
> >> On 18 April 2016 at 18:35, Alex Rukletsov <alex@mesosphere.com> wrote:
> >> > I believe it's because slaves are able to connect to the master, but the
> >> > master is not able to connect to the slaves. That's why you see them
> >> > connected for some time and gone afterwards.
> >> >
> >> > On Mon, Apr 18, 2016 at 6:47 PM, Stefano Bianchi <jazzista88@gmail.com> wrote:
> >> >>
> >> >> Indeed, I don't know why, but I am not able to reach all the machines
> >> >> from one network to the other; only some machines can connect to some
> >> >> others across the networks.
> >> >> In Mesos I see that at a certain point all the slaves are connected,
> >> >> then disconnected, and after a while connected again; it seems they
> >> >> are only able to stay connected for a while.
> >> >> However, I guess it is an OpenStack issue.
> >> >>
> >> >> Does this also happen when master3 is leading? My guess is that you're
> >> >> not allowing incoming connections from master1 and master2 to slave3.
> >> >> Generally, masters should be able to connect to slaves, not just respond
> >> >> to their requests.
> >> >>
> >> >> On 18 Apr 2016 13:17, "Stefano Bianchi" <jazzista88@gmail.com> wrote:
> >> >>>
> >> >>> Hi
> >> >>> On OpenStack I plugged two virtual networks into the same virtual router
> >> >>> so that the hosts on the two networks can communicate with each other.
> >> >>> This is my topology:
> >> >>>
> >> >>> -----------------------internet-----------------------
> >> >>>                            |
> >> >>>                         Router1
> >> >>>                            |
> >> >>> ------------------------------------------------------
> >> >>> |                                              |
> >> >>> Net1                                           Net2
> >> >>> Master1 Master2                                Master3
> >> >>> Slave1 Slave2                                  Slave3
> >> >>>
> >> >>> I have configured ZooKeeper with this line:
> >> >>>
> >> >>> zk://Master1_IP:2181,Master2_IP:2181,Master3_IP:2181/mesos
> >> >>>
> >> >>> The 3 masters, even though they are on 2 separate networks, elect the
> >> >>> leader correctly.
> >> >>> Now I have started the slaves, and at first I see all 3 correctly
> >> >>> registered, but after a while slave 3 disconnects, regardless of which
> >> >>> master is leading.
> >> >>> I looked in the log and found the message in the subject line.
> >> >>> Can you help me solve this problem?
> >> >>>
> >> >>>
> >> >>> Thanks to all.
> >> >
> >> >
> >
> >
>
