mesos-issues mailing list archives

From "Jie Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-5376) Add systemd watchdog support
Date Mon, 11 Jul 2016 18:22:11 GMT

    [ https://issues.apache.org/jira/browse/MESOS-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371349#comment-15371349
] 

Jie Yu commented on MESOS-5376:
-------------------------------

Who will be the shepherd for this ticket? [~idownes], are you gonna shepherd this work?

> Add systemd watchdog support
> ----------------------------
>
>                 Key: MESOS-5376
>                 URL: https://issues.apache.org/jira/browse/MESOS-5376
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: David Robinson
>            Assignee: Lawrence Wu
>
> It would be great if Mesos had support for systemd's [watchdog|http://0pointer.de/blog/projects/watchdog.html].
Users typically use a supervisor like [monit|https://mmonit.com/monit/] to check the
agent/master's /health endpoint and restart the process after consecutive failures. Systemd does not
poll services; instead, a service uses the watchdog to report liveness. Supervisor solutions
like monit could be replaced with systemd if Mesos had watchdog support. Note that simply
restarting the service upon failure (i.e., when the process exits) is not sufficient -- a deadlock
within Mesos would not cause the process to exit, but a watchdog could detect it.
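
For illustration only, the watchdog handshake described above can be sketched in a few lines. This is not Mesos code and not a proposed patch. It assumes a unit file that sets WatchdogSec= (and probably Restart=on-watchdog), in which case systemd exports NOTIFY_SOCKET and WATCHDOG_USEC to the service, and the service must send "WATCHDOG=1" datagrams before the timeout expires. The names sd_notify, watchdog_interval_seconds, heartbeat, and is_healthy are hypothetical helpers chosen for this sketch; is_healthy stands in for whatever internal check would play the role of the /health endpoint.

```python
import os
import socket
from typing import Callable, Optional


def sd_notify(message: str) -> bool:
    """Send one notification datagram to systemd.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process is
    not running under a systemd unit that expects notifications.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading '@' denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), path)
    return True


def watchdog_interval_seconds() -> Optional[float]:
    """Return half of the configured watchdog timeout, or None if unset.

    Pinging at half the timeout leaves headroom for scheduling delays.
    """
    usec = os.environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    return int(usec) / 1_000_000 / 2


def heartbeat(is_healthy: Callable[[], bool], stop, interval: float) -> None:
    """Ping the watchdog every `interval` seconds while the check passes.

    `stop` is expected to be a threading.Event. If `is_healthy` hangs
    (for example on a deadlocked lock) or returns False, no ping is
    sent and systemd will eventually treat the service as failed.
    """
    while not stop.wait(interval):
        if is_healthy():
            sd_notify("WATCHDOG=1")
```

A service would call watchdog_interval_seconds() at startup and, if it returns a value, run heartbeat in a background thread. The important design point is that is_healthy should exercise the same code paths that can deadlock; a heartbeat thread that pings unconditionally would keep a wedged process alive.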



