mesos-user mailing list archives

From Paul Bell <>
Subject Re: Detecting slave crashes event
Date Wed, 16 Sep 2015 18:11:37 GMT
Thank you, Benjamin.

So, I could periodically request the metrics endpoint, or stream the logs
(maybe via mesos.cli; or SSH)? What, roughly, does the "agent removed"
message look like in the logs?

Are there plans to offer a mechanism for event subscription?
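
A minimal sketch of the periodic-polling idea, in Python. It assumes the master serves its metrics as JSON at /metrics/snapshot on port 5050 and exposes a removal counter named master/slave_removals; neither the URL nor the counter name is given in this thread, so check both against your Mesos version. The helper names (fetch_snapshot, new_removals, watch) are hypothetical. As Benjamin notes, the counter only says how many agents were removed, not which ones.

```python
import json
import time
import urllib.request

# Assumed values, not taken from this thread: adjust to your cluster.
MASTER_METRICS_URL = "http://localhost:5050/metrics/snapshot"
REMOVALS_KEY = "master/slave_removals"


def fetch_snapshot(url=MASTER_METRICS_URL, timeout=5.0):
    """Fetch the metrics snapshot and return it as a dict."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def new_removals(previous, current, key=REMOVALS_KEY):
    """Return how many removals were counted between two snapshots."""
    before = previous.get(key, 0)
    after = current.get(key, 0)
    # A counter that went backwards suggests the master restarted or
    # failed over, so treat the current value as the whole delta.
    if after < before:
        return int(after)
    return int(after - before)


def watch(interval=30.0, on_removal=print):
    """Poll forever, calling on_removal whenever the counter grows."""
    previous = fetch_snapshot()
    while True:
        time.sleep(interval)
        current = fetch_snapshot()
        delta = new_removals(previous, current)
        if delta > 0:
            on_removal(f"{delta} agent(s) removed since last poll")
        previous = current
```

Calling watch(interval=60) from a small daemon and passing an alerting function as on_removal would be one way to wire this up. Keeping the comparison in new_removals separate from the HTTP call makes it possible to test the logic without a running master.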



On Wed, Sep 16, 2015 at 1:30 PM, Benjamin Mahler <> wrote:

> You can detect when we remove an agent due to health check failures via
> the metrics endpoint, but these are counters that are better used for
> alerting / dashboards for visibility. If you need to know which agents, you
> can also consume the logs as a stop-gap solution, until we offer a
> mechanism for subscribing to cluster events.
> On Wed, Sep 16, 2015 at 10:11 AM, Paul Bell <> wrote:
>> Hi All,
>> I am led to believe that, unlike Marathon, Mesos doesn't (yet?) offer a
>> subscribable event bus.
>> So I am wondering if there's a best-practice way of determining if a
>> slave node has crashed. By "crashed" I mean something like the power plug
>> got yanked, or anything that would cause Mesos to stop talking to the slave
>> node.
>> I suppose such information would be recorded in /var/log/mesos.
>> Interested to learn how best to detect this.
>> Thank you.
>> -Paul
