incubator-mesos-dev mailing list archives

From "Matei Zaharia (Updated) (JIRA)" <>
Subject [jira] [Updated] (MESOS-106) Failover timeout should default to 0
Date Wed, 21 Dec 2011 16:35:30 GMT


Matei Zaharia updated MESOS-106:

    Attachment: MESOS-106-v2.patch

Sure, here's an updated patch.
> Failover timeout should default to 0
> ------------------------------------
>                 Key: MESOS-106
>                 URL:
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Matei Zaharia
>         Attachments: MESOS-106-v2.patch, MESOS-106.patch
> Since the failover timeout was added, you get a lot of weird behavior in clusters running
frameworks that don't support failover due to its long default value of 1 day. If a framework
fails or just exits without calling driver.stop(), all its executors stay around and consume
resources on the machines, causing subsequent runs to mysteriously fail to acquire resources.
See for an example. I know
that the failover timeout is supposed to eventually become a per-framework parameter anyway,
but in the meantime, the easiest way to prevent this is to set it to 0, because almost no
users have failover-enabled frameworks.
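
The effect described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Mesos code: the names DisconnectedFramework and free_cpus are hypothetical, and the model assumes the master simply keeps a disconnected framework's resources reserved until the failover timeout has elapsed.

```python
from dataclasses import dataclass

ONE_DAY_SECS = 24 * 60 * 60  # the long default value mentioned in the issue


@dataclass
class DisconnectedFramework:
    """Hypothetical record of a framework that exited without driver.stop()."""
    name: str
    disconnected_at: float  # seconds, on an arbitrary clock
    held_cpus: int          # CPUs still held by its leftover executors


def free_cpus(total_cpus, disconnected, now, failover_timeout):
    """CPUs available to new frameworks at time `now` (toy model).

    Assumption: a disconnected framework keeps its resources until
    `failover_timeout` seconds have passed since it disconnected.
    """
    held = sum(
        fw.held_cpus
        for fw in disconnected
        if now - fw.disconnected_at < failover_timeout
    )
    return total_cpus - held


crashed = [DisconnectedFramework("job-1", disconnected_at=0.0, held_cpus=8)]

# One minute after the crash, on an 8-CPU cluster:
cpus_with_old_default = free_cpus(8, crashed, now=60.0, failover_timeout=ONE_DAY_SECS)
cpus_with_zero = free_cpus(8, crashed, now=60.0, failover_timeout=0)
```

In this model, a subsequent run one minute after the crash sees no free CPUs under the 1-day timeout, while a timeout of 0 returns all 8 CPUs as soon as the framework disconnects.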

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.

