aurora-dev mailing list archives

From Maxim Khutornenko <ma...@apache.org>
Subject Re: non-prod SLA stats
Date Fri, 29 May 2015 17:23:03 GMT
Hi Stephan,

Tracking the same set of metrics as for prod across all non-prod jobs
could be somewhat expensive on both the collection and consumption
sides. The only non-prod metrics we currently collect are MTTA/R, which
help us monitor the scheduling rate in view of reduced cluster capacity
(AURORA-774).
Perhaps we could put non-prod collection behind a set of command line
switches (Arg<Boolean>)? E.g.:

SLA_COLLECT_NON_PROD_MEDIANS
SLA_COLLECT_NON_PROD_JOB_UPTIMES
SLA_COLLECT_NON_PROD_PLATFORM_UPTIMES

These could be defined in SlaModule and injected into MetricCalculator
to let us finely tune the required non-prod collection set. What do
you think?
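
To make the idea concrete, here is a minimal sketch in plain Java. It
assumes nothing about the real code: it does not use Arg<Boolean> or
Guice, and the class name, the "-name=true" flag spelling and the group
names are all hypothetical stand-ins. It only shows three boolean
switches, all off by default, deciding which non-prod groups get
collected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual SlaModule or MetricCalculator.
final class NonProdSlaSketch {
  // One switch per proposed flag; all default to false.
  final boolean collectMedians;
  final boolean collectJobUptimes;
  final boolean collectPlatformUptimes;

  NonProdSlaSketch(boolean medians, boolean jobUptimes, boolean platformUptimes) {
    this.collectMedians = medians;
    this.collectJobUptimes = jobUptimes;
    this.collectPlatformUptimes = platformUptimes;
  }

  // Reads "-<name>=<true|false>" from args; the flag syntax is an assumption.
  private static boolean flag(String[] args, String name) {
    String prefix = "-" + name + "=";
    boolean value = false; // collection stays off unless explicitly enabled
    for (String arg : args) {
      if (arg.startsWith(prefix)) {
        value = Boolean.parseBoolean(arg.substring(prefix.length()));
      }
    }
    return value;
  }

  // Stand-in for the module: turns command line args into settings.
  static NonProdSlaSketch parse(String[] args) {
    return new NonProdSlaSketch(
        flag(args, "sla_collect_non_prod_medians"),
        flag(args, "sla_collect_non_prod_job_uptimes"),
        flag(args, "sla_collect_non_prod_platform_uptimes"));
  }

  // Stand-in for the calculator: only enabled groups are collected.
  List<String> enabledNonProdGroups() {
    List<String> groups = new ArrayList<>();
    if (collectMedians) {
      groups.add("medians");
    }
    if (collectJobUptimes) {
      groups.add("job_uptimes");
    }
    if (collectPlatformUptimes) {
      groups.add("platform_uptimes");
    }
    return groups;
  }

  public static void main(String[] args) {
    System.out.println(parse(args).enabledNonProdGroups());
  }
}
```

With no flags given, nothing extra would be collected for non-prod, so
the cost is only paid by clusters that opt in.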

Thanks,
Maxim

On Fri, May 29, 2015 at 7:09 AM, Erb, Stephan
<Stephan.Erb@blue-yonder.com> wrote:
> Hi everyone,
>
> we are interested in the job uptime percentiles and the aggregate
> cluster uptime percentage not only for production jobs, but also for
> our non-production jobs.
>
> Are there any reasons why those are not available in a non-prod
> version, similar to the current handling of mtta and mttr [1]? If
> there are no objections, I will prepare a patch.
>
> Regards,
> Stephan
>
> [1] https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/sla/MetricCalculator.java#L69
