mesos-dev mailing list archives

From "Dominic Hamon (JIRA)" <>
Subject [jira] [Assigned] (MESOS-1036) Implement a library for exposing statistical metrics.
Date Mon, 03 Mar 2014 19:48:25 GMT


Dominic Hamon reassigned MESOS-1036:

    Assignee: Dominic Hamon

> Implement a library for exposing statistical metrics.
> -----------------------------------------------------
>                 Key: MESOS-1036
>                 URL:
>             Project: Mesos
>          Issue Type: Improvement
>          Components: statistics
>            Reporter: Benjamin Mahler
>            Assignee: Dominic Hamon
> Currently, statistical metrics are reported only through component-specific endpoints, primarily the following two:
> {noformat}
> /master/stats.json
> /slave/stats.json
> {noformat}
> Additional endpoints have not been added (for example, for containerization statistics, allocator statistics, or libprocess statistics) because doing so is awkward: one must either propagate the data up to these higher-level endpoints or add a new endpoint for exposing the component-specific statistics.
> This is why the {{Statistics}} class in libprocess was created; however, it is not currently used for any statistical reporting.
> [~benjaminhindman] and I had white-boarded the kinds of abstractions we wanted to build
to make statistical reporting trivial from anywhere in the code:
> Create the notion of a {{Statistic}} or {{Metric}} object that can be directly manipulated
to store statistics, for example:
> {code}
> // In the Registrar initialization:
> Metric storage_latency = statistics.create("registrar", "storage_latency");
> // Recording an individual storage latency.
> storage_latency.set(latency);
> {code}
> In addition to this, we wanted the notion of a {{Meter}}, which automatically exposes
a metered version of a statistic, for example:
> {code}
> Metric storage_latency = statistics.create("registrar", "storage_latency");
> // Adds "storage_latency_average" which computes average over the window.
> statistics.meter(storage_latency, Average());
> // Adds a "storage_latency_p99", percentile is a non-trivial implementation.
> statistics.meter(storage_latency, Percentile(99));
> // Adds a "storage_latency_maximum"
> statistics.meter(storage_latency, Maximum());
> {code}
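
To make the two snippets above concrete, here is a minimal, self-contained C++ sketch of how a metric and its meters might fit together. It is an illustration only, not the API proposed in the ticket: the `Meter` struct, the `snapshot()` method, the `capacity` parameter, the "component/name" key format, the use of `std::shared_ptr`, and the choice to report 0 when no samples exist are all assumptions made for this example. The percentile uses the simple nearest-rank method over a sorted copy of the samples.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <functional>
#include <map>
#include <memory>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical metric: retains only the most recent `capacity` samples.
class Metric {
public:
  Metric(std::string name, std::size_t capacity)
    : name_(std::move(name)), capacity_(std::max<std::size_t>(capacity, 1)) {}

  void set(double value) {
    if (samples_.size() == capacity_) samples_.pop_front();  // Drop oldest.
    samples_.push_back(value);
  }

  const std::string& name() const { return name_; }
  std::vector<double> samples() const {
    return std::vector<double>(samples_.begin(), samples_.end());
  }

private:
  std::string name_;
  std::size_t capacity_;
  std::deque<double> samples_;
};

// Hypothetical meter: a name suffix plus a function over the retained samples.
struct Meter {
  std::string suffix;
  std::function<double(std::vector<double>)> compute;  // By value: may sort.
};

inline Meter Average() {
  return {"average", [](std::vector<double> v) -> double {
    if (v.empty()) return 0.0;  // Assumption: report 0 when there is no data.
    return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
  }};
}

inline Meter Maximum() {
  return {"maximum", [](std::vector<double> v) -> double {
    return v.empty() ? 0.0 : *std::max_element(v.begin(), v.end());
  }};
}

// Nearest-rank percentile: the value at rank ceil(p / 100 * n), 1-based.
inline Meter Percentile(int p) {
  const std::size_t pct = static_cast<std::size_t>(std::min(std::max(p, 0), 100));
  return {"p" + std::to_string(p), [pct](std::vector<double> v) -> double {
    if (v.empty()) return 0.0;
    std::sort(v.begin(), v.end());
    std::size_t rank = (pct * v.size() + 99) / 100;  // Integer ceiling.
    rank = std::min(std::max<std::size_t>(rank, 1), v.size());
    return v[rank - 1];
  }};
}

// Hypothetical registry standing in for the `statistics` object above.
class Statistics {
public:
  std::shared_ptr<Metric> create(const std::string& component,
                                 const std::string& name,
                                 std::size_t capacity = 1024) {
    return std::make_shared<Metric>(component + "/" + name, capacity);
  }

  void meter(const std::shared_ptr<Metric>& metric, Meter m) {
    meters_.emplace_back(metric, std::move(m));
  }

  // Evaluates every registered meter, keyed as "<component>/<name>_<suffix>".
  std::map<std::string, double> snapshot() const {
    std::map<std::string, double> out;
    for (const auto& entry : meters_) {
      out[entry.first->name() + "_" + entry.second.suffix] =
        entry.second.compute(entry.first->samples());
    }
    return out;
  }

private:
  std::vector<std::pair<std::shared_ptr<Metric>, Meter>> meters_;
};
```

With this sketch, a caller would create the metric once, register `Average()`, `Percentile(99)`, and `Maximum()` against it, call `set(latency)` on each observation, and have an endpoint serialize `snapshot()`. Keeping raw samples (even capped) is the simplest thing that works here; it does not address the memory concern discussed further down.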
> Of course, I'm not advocating a particular API in the above examples; I'm just laying out the types of things we wanted to see available.
> As we add these types of abstractions, we will want to avoid storing large time-series data in memory, as is currently done in {{Statistics}}. There are a number of things to consider with respect to the windowing technique, but I think the notion of a window should transition from "amount of history to be kept" to "a statistical rolling window". For example, when computing an average, you would most likely want a rolling 1-minute average, as opposed to the average over a 2-week window.
> Efficiency of this library will be important to avoid high RSS overhead.
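
As one possible reading of the "statistical rolling window" idea, the sketch below keeps a rolling average in fixed memory by aggregating samples into one-second buckets instead of storing each sample. This is a bucketed approximation chosen for the example, not something the ticket specifies; the `RollingAverage` class, its method names, the one-second granularity, the caller-supplied non-negative timestamps, and the choice to report 0 for an empty window are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rolling average over the last `windowSeconds` seconds.
// Memory use is O(windowSeconds), independent of how many samples arrive.
class RollingAverage {
public:
  explicit RollingAverage(std::size_t windowSeconds)
    : buckets_(windowSeconds == 0 ? 1 : windowSeconds) {}

  // Records `value` observed at `nowSeconds` (assumed non-negative and
  // supplied by the caller, e.g. from a monotonic clock).
  void record(std::int64_t nowSeconds, double value) {
    Bucket& b = buckets_[index(nowSeconds)];
    if (b.second != nowSeconds) {
      // The slot holds data from an older second: recycle it.
      b.second = nowSeconds;
      b.sum = 0.0;
      b.count = 0;
    }
    b.sum += value;
    ++b.count;
  }

  // Average of all samples whose second lies in (now - window, now].
  double average(std::int64_t nowSeconds) const {
    const std::int64_t oldest =
      nowSeconds - static_cast<std::int64_t>(buckets_.size()) + 1;
    double sum = 0.0;
    std::size_t count = 0;
    for (const Bucket& b : buckets_) {
      if (b.count > 0 && b.second >= oldest && b.second <= nowSeconds) {
        sum += b.sum;
        count += b.count;
      }
    }
    return count == 0 ? 0.0 : sum / count;  // Assumption: 0 when empty.
  }

private:
  struct Bucket {
    std::int64_t second = -1;  // Which second this slot currently represents.
    double sum = 0.0;
    std::size_t count = 0;
  };

  std::size_t index(std::int64_t second) const {
    return static_cast<std::size_t>(second) % buckets_.size();
  }

  std::vector<Bucket> buckets_;
};
```

A `RollingAverage(60)` would give the rolling 1-minute average mentioned above using 60 small buckets, whatever the sample rate. Maximum can be bucketed the same way; percentiles cannot be merged across buckets like this and would need a different summary structure, which is consistent with the ticket calling them non-trivial.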

This message was sent by Atlassian JIRA
