aurora-reviews mailing list archives

From Kai Huang <>
Subject Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)
Date Wed, 14 Sep 2016 00:40:22 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated Sept. 14, 2016, 12:40 a.m.)

Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Modify the change list item.

Bugs: AURORA-1225

Repository: aurora

Description (updated)

Modify executor state transition logic to rely on health checks (if enabled).

The executor needs to start executing user content in STARTING and transition to RUNNING when
the required number of successful health checks is reached.

This review contains a series of executor changes that implement health-check-driven updates.
It gives fuller context on the design of this feature.

Please see this epic:
and the design doc:
for more details and background.

If health checking is enabled on the vCurrent executor, the health checker sends a "TASK_RUNNING"
message once the required number of successful health checks is reached within initial_interval_secs.
On the other hand, a "TASK_FAILED" message is sent if the task fails to reach the required number
of successful health checks within initial_interval_secs, or if the maximum number of failed
health checks is reached after initial_interval_secs.

If health checking is disabled on the vCurrent executor, it sends a "TASK_RUNNING" message
to the scheduler, so the behavior is the same as that of the vPrev executor.
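
The transition rules described above can be sketched roughly as follows. This is only an
illustration of the decision logic, not the code under review: the function name `evaluate`,
its parameters (`min_successes`, `failure_limit`, and so on), and the choice to treat a limit
as "reached" when the count is greater than or equal to it are all assumptions made for the example.

```python
# Hypothetical sketch of the health-check-driven transitions described above.
# None of these names come from the Aurora executor; they are illustrative only.
TASK_RUNNING = "TASK_RUNNING"
TASK_FAILED = "TASK_FAILED"


def evaluate(enabled, running, elapsed_secs, successes, failures,
             initial_interval_secs, min_successes, failure_limit):
    """Return the status message to send, or None if nothing should be sent."""
    if not enabled:
        # Health checking disabled: report TASK_RUNNING once, as vPrev does.
        return None if running else TASK_RUNNING

    if not running:
        if elapsed_secs <= initial_interval_secs:
            # Inside the initial window: wait for enough successful checks.
            return TASK_RUNNING if successes >= min_successes else None
        # The window expired before enough successes were seen.
        return TASK_FAILED

    # Already running: after the window, only the failure limit matters.
    # Assumption: "reached" means failures >= failure_limit.
    if elapsed_secs > initial_interval_secs and failures >= failure_limit:
        return TASK_FAILED
    return None
```

For instance, with a 15-second window and two required successes, a task that has only one
success after 20 seconds would be reported as "TASK_FAILED" under these assumptions.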

[Change List]
The current change set includes:
1. Removed the status memoization in ChainedStatusChecker.
2. Modified the StatusManager to be edge triggered.
3. Changed the Aurora Executor callback function.
4. Modified the Health Checker and redefined the meaning of initial_interval_secs.
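
As a rough illustration of what "edge triggered" in item 2 could mean, the sketch below invokes
a callback only when the polled status changes, rather than on every poll. The class name
`EdgeTriggeredWatcher`, its `tick` method, and the callables passed in are hypothetical and
are not taken from the StatusManager in this change.

```python
# Hypothetical sketch of edge-triggered status reporting; not the actual StatusManager.
class EdgeTriggeredWatcher:
    """Calls `callback` only when the polled status differs from the last one reported."""

    def __init__(self, poll_status, callback):
        self._poll_status = poll_status  # callable returning a status string or None
        self._callback = callback        # callable invoked with each new status
        self._last_status = None

    def tick(self):
        status = self._poll_status()
        # Fire on a change of status (an "edge"); ignore None and repeated values.
        if status is not None and status != self._last_status:
            self._last_status = status
            self._callback(status)


# Example: four polls, but only two status changes are reported.
polled = iter([None, "TASK_RUNNING", "TASK_RUNNING", "TASK_FAILED"])
reported = []
watcher = EdgeTriggeredWatcher(lambda: next(polled), reported.append)
for _ in range(4):
    watcher.tick()
```

Under this reading, the poller has to return the current status on every call so that a later
change can be observed, which is one way to interpret why item 1 removes memoization.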

So far I have fixed all tests broken by my changes. However, more tests need to be
added to accommodate the executor change. I will send a follow-up review update once I cover
more edge cases. Any feedback on the implementation is highly appreciated.


  src/main/python/apache/aurora/executor/ ce5ef680f01831cd89fced8969ae3246c7f60cfd

  src/main/python/apache/aurora/executor/common/ 5fc845eceac6f0c048d7489fdc4c672b0c609ea0

  src/main/python/apache/aurora/executor/common/ 795dae2d6b661fc528d952c2315196d94127961f

  src/main/python/apache/aurora/executor/ 228a99a05f339e21cd7e769a42b9b2276e7bc3fc

  src/test/python/apache/aurora/executor/common/ bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e

  src/test/python/apache/aurora/executor/common/ 5be1981c8c8e88258456adb21aa3ca7c0aa472a7

  src/test/python/apache/aurora/executor/ ce4679ba1aa7b42cf0115c943d84663030182d23

  src/test/python/apache/aurora/executor/ 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e




./pants test.pytest src/test/python/apache/aurora/executor::


Kai Huang
