mesos-dev mailing list archives

From "Benjamin Hindman (Commented) (JIRA)" <>
Subject [jira] [Commented] (MESOS-106) Failover timeout should default to 0
Date Mon, 19 Dec 2011 19:07:32 GMT


Benjamin Hindman commented on MESOS-106:

Actually, it would be sweet if we made that a constant in master/constants.hpp so that all
constants are defined there instead of at the actual configuration 'get' sites.
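
A minimal sketch of that idea, under stated assumptions: the constant name `DEFAULT_FAILOVER_TIMEOUT_SECS`, the `failover_timeout` key, and the simplified `Configuration` class below are hypothetical stand-ins for illustration, not taken from the Mesos source. The point is only that the 'get' site names a constant instead of embedding a literal default.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical contents of master/constants.hpp: every default lives here.
namespace constants {
// Assumed name; 0 means "no failover grace period", as MESOS-106 proposes.
const double DEFAULT_FAILOVER_TIMEOUT_SECS = 0.0;
}

// Simplified stand-in for a configuration object (not the real Mesos class).
class Configuration {
public:
  void set(const std::string& key, const std::string& value) {
    values[key] = value;
  }

  // Returns the parsed value for `key`, or `fallback` if the key is
  // absent or cannot be parsed as a number.
  double get(const std::string& key, double fallback) const {
    std::map<std::string, std::string>::const_iterator it = values.find(key);
    if (it == values.end()) return fallback;
    std::istringstream in(it->second);
    double parsed;
    return (in >> parsed) ? parsed : fallback;
  }

private:
  std::map<std::string, std::string> values;
};

// The 'get' site refers to the shared constant rather than a literal.
// The key name "failover_timeout" is an assumption for this sketch.
double failoverTimeout(const Configuration& conf) {
  double timeout =
      conf.get("failover_timeout", constants::DEFAULT_FAILOVER_TIMEOUT_SECS);
  assert(timeout >= 0.0);  // a negative timeout would be meaningless here
  return timeout;
}
```

With this layout, changing the default means editing one line in the constants header, and every call site picks it up.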
> Failover timeout should default to 0
> ------------------------------------
>                 Key: MESOS-106
>                 URL:
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Matei Zaharia
>         Attachments: MESOS-106.patch
> Since the failover timeout was added, you get a lot of weird behavior in clusters running
> frameworks that don't support failover due to its long default value of 1 day. If a framework
> fails or just exits without calling driver.stop(), all its executors stay around and consume
> resources on the machines, causing subsequent runs to mysteriously fail to acquire resources.
> See for an example. I know
> that the failover timeout is supposed to eventually become a per-framework parameter anyway,
> but in the meantime, the easiest way to prevent this is to set it to 0, because almost no
> users have failover-enabled frameworks.
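
The behavior described in the issue can be sketched roughly as follows. This is a simplified model, not Mesos code: the `Framework` struct and `shouldReleaseResources` function are hypothetical names, and the sketch assumes the master only reclaims a framework's executors once its scheduler has been gone for at least the failover timeout.

```cpp
#include <cassert>

// Hypothetical, simplified model of a registered framework; not Mesos code.
struct Framework {
  bool connected;             // is the scheduler currently connected?
  double disconnectedAtSecs;  // time at which the scheduler went away
};

// Returns true once the framework's executors may be torn down:
// the scheduler is gone and the failover timeout has elapsed.
bool shouldReleaseResources(const Framework& fw,
                            double nowSecs,
                            double failoverTimeoutSecs) {
  assert(failoverTimeoutSecs >= 0.0);
  if (fw.connected) return false;
  return nowSecs - fw.disconnectedAtSecs >= failoverTimeoutSecs;
}
```

Under this model, a timeout of 0 releases resources as soon as the scheduler disappears, while a timeout of 1 day (86400 seconds) keeps the executors holding resources for a full day after a framework exits without calling driver.stop().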
