aurora-dev mailing list archives

From "Erb, Stephan" <>
Subject Re: Health Checks for Updates design review
Date Wed, 06 May 2015 08:33:07 GMT
Hi Maxim,

I am not keen on the potential risk of tasks getting stuck in STARTING. We perform auto-scaling
of jobs, so there might be nobody around to notice and correct the problem in time.

How about keeping initial_interval_secs and just changing its meaning to a grace period,
so that health checks are triggered but errors are ignored during this interval?

initial_interval_secs is then a user-configurable upper bound on how long a job may take to
become healthy. It can even be set rather high, because it won't affect update performance.
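
To make the grace-period idea concrete, here is a minimal sketch in Python. It is not Aurora's actual health checker: the class name, the observe() method, and the max_consecutive_failures parameter are assumptions made up for illustration. Only initial_interval_secs and the STARTING state come from this thread.

```python
from dataclasses import dataclass

# Task states used by this sketch (RUNNING and FAILED are illustrative names).
STARTING, RUNNING, FAILED = "STARTING", "RUNNING", "FAILED"


@dataclass
class GracePeriodChecker:
    """Hypothetical checker: failures inside the grace period are ignored."""

    initial_interval_secs: float        # grace period, measured from task start
    max_consecutive_failures: int = 0   # assumed tolerance once the grace period ends
    consecutive_failures: int = 0
    state: str = STARTING

    def observe(self, elapsed_secs: float, healthy: bool) -> str:
        """Record one health check result taken elapsed_secs after task start."""
        if self.state == FAILED:
            return self.state
        if healthy:
            # A passing check moves the task to RUNNING right away, so a long
            # grace period does not slow down a healthy update.
            self.consecutive_failures = 0
            self.state = RUNNING
        elif elapsed_secs >= self.initial_interval_secs:
            # Failures only count once the grace period has elapsed.
            self.consecutive_failures += 1
            if self.consecutive_failures > self.max_consecutive_failures:
                self.state = FAILED
        # A failing check inside the grace period changes nothing.
        return self.state
```

With a grace period of 30 seconds, a failing check at 5 seconds leaves the task in STARTING, a passing check at 10 seconds moves it to RUNNING, and a task that is still failing at 30 seconds is marked FAILED instead of staying in STARTING indefinitely.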

What do you think?

Best Regards,
From: Maxim Khutornenko <>
Sent: Tuesday, May 5, 2015 10:24 PM
Subject: Health Checks for Updates design review


I have put together a design proposal for improving health-enabled job
update performance. Please review and leave your comments:
