aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kai Huang <texasred2...@hotmail.com>
Subject Re: Review Request 51536: @ReviewBot retry Scheduler updater will not use watch_sec if health check is enabled
Date Wed, 31 Aug 2016 17:38:09 GMT


> On Aug. 30, 2016, 9:25 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java, line 130
> > <https://reviews.apache.org/r/51536/diff/1/?file=1488791#file1488791line130>
> >
> >     Doing this based simply on the presence of a string feels brittle to me. Would
it make sense to define some basic form of "executor capabilities" passing? Even if it's something
simple like a comma delimited list of constants that we define in api.thrift (so that both
the scheduler and the executor can reuse the values)?

Do you mean adding executor capabilities as one attribute of TaskEvent in api.thrift? 

I think on the executor side , we can put (isHealthCheckEnabled() && isExecutorUpdated())
into one attribute called "watch_sec_skippable", and send it to the scheduler.


- Kai


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review147358
-----------------------------------------------------------


On Aug. 30, 2016, 8:52 p.m., Kai Huang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> -----------------------------------------------------------
> 
> (Updated Aug. 30, 2016, 8:52 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> - Scheduler updater will not use watch_sec if health check is enabled
> - Add unit tests for successful updates.
> 
> This feature intends to improve reliability and performance of the Aurora scheduler job
updater by relying on health check status rather than watch_secs timeout when deciding an
individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
for more details and background.
> 
> 
> Assumptions on executor change:
> The executor starts health check at STARTING, if a successful health check is performed
before initial_interval_sec expires, the executor will sends a status message for RUNNING.
> 
> What I try to implement:
> vCurrent updater will infer the executor version from the presence of a status message
sent by the executor during RUNNING status update. 
> 
> For vPrev executor:
>     The vCurrent updater will use watch_sec for instance update, this is independent
from the health check.
> For vCurrent executor:
>     If health check is disabled on vCurrent executor, the vCurrent updater will use watch_sec
for instance update.
>     If health check is enabled on vCurrent executor, the vCurrent updater will not use
watch_sec for instance update.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java c129896d8cd54abd2634e2a339c27921042b0162

>   src/test/java/org/apache/aurora/scheduler/updater/InstanceUpdaterTest.java c78c7fbd7d600586136863c99ce3d7387895efee

> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> -------
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.InstanceUpdaterTest"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message