aurora-reviews mailing list archives

From Kai Huang <texasred2...@hotmail.com>
Subject Re: Review Request 51536: @ReviewBot retry Scheduler updater will not use watch_sec if health check is enabled
Date Wed, 07 Sep 2016 22:06:41 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/
-----------------------------------------------------------

(Updated Sept. 7, 2016, 10:06 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Summary (updated)
-----------------

@ReviewBot retry Scheduler updater will not use watch_sec if health check is enabled


Bugs: AURORA-894
    https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
-------

- Scheduler updater will not use watch_sec if health check is enabled.

This feature aims to improve the reliability and performance of the Aurora scheduler job updater
by relying on health check status, rather than the watch_secs timeout, when deciding the update
state of an individual instance.

See this epic: https://issues.apache.org/jira/browse/AURORA-894 
and the design doc: https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
for more details and background.

After discussion on the Aurora dev list, we decided to keep the watch_secs infrastructure intact
on the scheduler side. We concluded that we will adopt the following implementation:

1. If users want purely health-check-driven updates, they can set watch_secs to 0 and
enable health checks.

2. If they want both health-check-driven and time-driven updates, they can set watch_secs
to the time that they care about and run health checks in the STARTING state as well.

3. If they want only time-driven updates, they can disable health checking and set watch_secs
to the time that they care about.
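
The three modes above can be summarized in a short sketch. The function name
update_mode and the labels it returns are hypothetical, introduced only for
illustration; they are not part of Aurora. The combination of watch_secs set to 0
with health checks disabled is not covered by the list, so the sketch only flags it.

```python
def update_mode(watch_secs: int, health_check_enabled: bool) -> str:
    """Classify an update configuration into one of the three modes.

    Hypothetical helper for illustration only; not an Aurora API.
    """
    if watch_secs < 0:
        raise ValueError("watch_secs must be non-negative")
    if health_check_enabled and watch_secs == 0:
        # Mode 1: only health check results decide the instance state.
        return "health-check-driven"
    if health_check_enabled:
        # Mode 2: health checks plus a watch_secs window.
        return "health-check-and-time-driven"
    if watch_secs > 0:
        # Mode 3: only the watch_secs window is used.
        return "time-driven"
    # Not one of the three modes described in this review.
    return "unspecified"
```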

This review contains only one scheduler change:
The scheduler currently does not accept a zero value for watch_secs; we need to relax this constraint.
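
As a rough illustration of the relaxed constraint (the actual change is in the Java
scheduler code listed under Diffs, which this sketch does not reproduce), the check
moves from "strictly positive" to "non-negative". The function validate_watch_secs
and its allow_zero flag are hypothetical names used only here.

```python
def validate_watch_secs(watch_secs: int, allow_zero: bool = True) -> int:
    """Return watch_secs if acceptable, otherwise raise ValueError.

    allow_zero=False mimics the stricter check described as the current
    behavior; allow_zero=True mimics the relaxed check proposed here.
    Hypothetical helper for illustration only.
    """
    minimum = 0 if allow_zero else 1
    if watch_secs < minimum:
        raise ValueError(f"watch_secs must be >= {minimum}, got {watch_secs}")
    return watch_secs
```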

Executor change still to do (in a separate review):
The executor starts health checking at STARTING. If a health check succeeds before
initial_interval_sec expires, the executor will send a status message for RUNNING.
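
The planned executor behavior could look roughly like the following sketch. It is
not the executor implementation: wait_for_running, check_health, interval_sec, and
the injectable clock and sleep parameters are assumptions made for illustration.
What the executor does when no check succeeds in time is not specified here, so the
sketch simply returns False and leaves that decision to the caller.

```python
import time
from typing import Callable


def wait_for_running(
    check_health: Callable[[], bool],
    initial_interval_sec: float,
    interval_sec: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check_health while the task is STARTING.

    Returns True as soon as one health check succeeds before
    initial_interval_sec expires (the point at which a RUNNING status
    would be sent), and False if the window expires first.
    Hypothetical sketch; names are not taken from the Aurora executor.
    """
    deadline = clock() + initial_interval_sec
    while clock() < deadline:
        if check_health():
            return True
        sleep(interval_sec)
    return False
```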


Diffs
-----

  RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4
  src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 04551f17999d742c53dfb1b36286b119b448550e


Diff: https://reviews.apache.org/r/51536/diff/


Testing
-------

./gradlew build

./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"

./build-support/jenkins/build.sh


Thanks,

Kai Huang

