mesos-dev mailing list archives

From: Neil Conway
Subject: Re: Registering and framework failover
Date: Wed, 13 Jul 2016 12:50:29 GMT
On Wed, Jul 13, 2016 at 2:44 PM, Evers Benno wrote:
> imagine the following situation: I am a framework with failover timeout
> of 1 hour, and 59 minutes and 55 seconds after shutting down I want to
> register with the master again.
> If my registration attempt arrives at the master within the time limit
> everything will be fine and I even get back the old tasks for
> reconciliation, but if it arrives slightly later the framework ID is
> permanently blocked by Mesos, and I am not able to register. Instead, I
> will receive an error()-callback with the message "Framework has been
> removed".

Right: if you set a failover_timeout of 1 hour, your framework is
expected to reregister within one hour. If it does not, all of its
tasks will be killed and you need to start over with a new
FrameworkID. Can you clarify which aspect of this behavior is
problematic for you?
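
As a rough illustration of the rule above, here is a minimal Python sketch of
how a framework might decide whether to reuse its saved FrameworkID or start
over. It does not call any Mesos API: the helper name choose_framework_id, the
safety_margin parameter, and the locally recorded disconnect time are all
hypothetical, and the framework's clock is only an estimate of when the master
started counting.

```python
from typing import Optional

# Assumed to match the value the framework set in FrameworkInfo.failover_timeout.
FAILOVER_TIMEOUT_SECS = 60 * 60.0


def choose_framework_id(saved_id: Optional[str],
                        disconnected_at: float,
                        now: float,
                        failover_timeout: float = FAILOVER_TIMEOUT_SECS,
                        safety_margin: float = 30.0) -> Optional[str]:
    """Return the saved FrameworkID if it is still likely to be accepted.

    Returning None means: register without an ID and let the master
    assign a new one (the old tasks are then lost).
    """
    if saved_id is None:
        # Nothing to fail over to; this is a first-time registration.
        return None
    elapsed = now - disconnected_at
    # Give up on the old ID slightly early, since the registration
    # message still has to reach the master before the timeout fires.
    if elapsed + safety_margin >= failover_timeout:
        return None
    return saved_id


# 59 min 55 s after disconnecting, with a 1 hour timeout: too close, start over.
late = choose_framework_id("fw-1", disconnected_at=0.0, now=59 * 60 + 55)
# 10 minutes after disconnecting: reuse the old ID and reconcile tasks.
early = choose_framework_id("fw-1", disconnected_at=0.0, now=600.0)
```

Even with a check like this, the framework still has to handle the "Framework
has been removed" error, because only the master knows exactly when the
timeout expired.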

Note that a failover_timeout of 1 hour is probably a little low.

> Is there any way to reliably connect to the master while also
> reconciling old tasks if possible?

Sorry, not sure what you mean by this.

