mesos-user mailing list archives

From Vinod Kone <vinodk...@gmail.com>
Subject Re: Mesos Slave Port Change Fails Recovery
Date Thu, 02 Jul 2015 21:15:40 GMT
For slave recovery to work, the slave is expected to come back up with the
same configuration it had before the restart.
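
One way to catch this kind of drift before a rolling upgrade is to diff the old and new slave settings. The snippet below is only a minimal sketch: the helper `changed_flags` and the dictionary keys are hypothetical names chosen for illustration, and the values mirror the setup described in the quoted message. It is not part of Mesos.

```python
def changed_flags(previous, current):
    """Return {flag: (old, new)} for every flag whose value differs.

    Both arguments are plain dicts of flag name -> value. A flag that is
    missing on one side is reported with None for that side.
    """
    names = sorted(set(previous) | set(current))
    return {
        name: (previous.get(name), current.get(name))
        for name in names
        if previous.get(name) != current.get(name)
    }


# Hypothetical settings, modelled on the quoted message: only the port changes.
previous = {"port": "5050", "checkpoint": "true", "recover": "reconnect"}
current = {"port": "5051", "checkpoint": "true", "recover": "reconnect"}

diff = changed_flags(previous, current)
print(diff)  # {'port': ('5050', '5051')}
```

An empty result would suggest the settings match across the restart; any entry is worth reviewing before relying on recovery.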

On Thu, Jul 2, 2015 at 2:10 PM, Philippe Laflamme <philippe@hopper.com>
wrote:

> Hi,
>
> I'm trying to roll out an upgrade from 0.20.0 to 0.21.0 with slaves
> configured with checkpointing and with "reconnect" recovery.
>
> I was investigating why the slaves would successfully re-register with the
> master and recover, but would subsequently be asked to shut down ("health
> check timeout").
>
> It turns out that our slaves had been unintentionally configured to use
> port 5050 in the previous configuration. We decided to fix that during the
> upgrade and have them use the default 5051 port.
>
> This change seems to make the health checks fail and eventually kills the
> slave due to inactivity.
>
> I've confirmed that leaving the port as it was in the previous
> configuration lets the slave re-register successfully, and it is not asked
> to shut down later on.
>
> Is this a known issue? I haven't been able to find a JIRA ticket for this.
> Maybe it's the expected behaviour? Should I create a ticket?
>
> Thanks,
> Philippe
>
