mesos-issues mailing list archives

From "Dominic Hamon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MESOS-540) Executor health checking.
Date Mon, 27 Oct 2014 18:53:33 GMT

     [ https://issues.apache.org/jira/browse/MESOS-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Hamon updated MESOS-540:
--------------------------------
    Labels: newbie twitter  (was: newbie)

> Executor health checking.
> -------------------------
>
>                 Key: MESOS-540
>                 URL: https://issues.apache.org/jira/browse/MESOS-540
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Benjamin Mahler
>              Labels: newbie, twitter
>
> We currently do not health check running executors.
> At Twitter, this has led to out-of-band health checking of executors for an
> internal framework.
> For the Storm framework, this has led to out-of-band health checking via
> ZooKeeper. Health checking would allow Storm to use finer grained executors
> for better isolation.
> This would also help the Hadoop and Jenkins frameworks, should health checking be desired there.
> As for implementation, I would propose adding a call on the Executor interface:
> /**
>  * Invoked by the ExecutorDriver to determine the health of the executor.
>  * When this function returns, the Executor is considered healthy.
>  */
> virtual void heartbeat(ExecutorDriver* driver) = 0;
> The driver can then heartbeat periodically and kill the Executor when it
> stops responding to heartbeats. The driver should also detect the executor
> deadlocking on any of the other callbacks.
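
The proposal above can be illustrated with a minimal sketch of one heartbeat check. This is not the Mesos API: the `Executor` class below is a simplified, hypothetical stand-in (the `ExecutorDriver*` argument is omitted), and `isHealthy`, `ResponsiveExecutor` and `StuckExecutor` are names invented for illustration. The sketch assumes the check runs `heartbeat()` on a separate thread and treats the executor as unhealthy if the call has not returned within a timeout.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <thread>

// Hypothetical, simplified stand-in for the proposed interface (not the real Mesos API).
class Executor {
public:
  virtual ~Executor() = default;
  // Returning from this call signals that the executor is healthy.
  virtual void heartbeat() = 0;
};

// Illustrative executor that answers immediately.
class ResponsiveExecutor : public Executor {
public:
  void heartbeat() override {}
};

// Illustrative executor that is slow to answer, standing in for a hung one.
class StuckExecutor : public Executor {
public:
  void heartbeat() override {
    std::this_thread::sleep_for(std::chrono::milliseconds(300));
  }
};

// Returns true if heartbeat() returns within `timeout`, false otherwise.
bool isHealthy(std::shared_ptr<Executor> executor,
               std::chrono::milliseconds timeout) {
  assert(executor != nullptr);
  auto done = std::make_shared<std::promise<void>>();
  std::future<void> result = done->get_future();
  // The thread is detached so that a hung heartbeat() cannot block the caller.
  // The shared_ptr copies keep the executor and promise alive until it finishes.
  std::thread([executor, done]() {
    executor->heartbeat();
    done->set_value();
  }).detach();
  return result.wait_for(timeout) == std::future_status::ready;
}
```

A driver following the proposal might call something like `isHealthy` on a fixed interval and kill the executor after a missed heartbeat. How many misses to tolerate, and how to detect deadlocks in the other callbacks, is left open by the issue and is not covered here.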



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
