aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Joshua Cohen" <jco...@apache.org>
Subject Re: Review Request 41154: Add support for performing health checks with a shell command.
Date Fri, 11 Dec 2015 01:28:41 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41154/#review109901
-----------------------------------------------------------



docs/configuration-reference.md (lines 387 - 389)
<https://reviews.apache.org/r/41154/#comment169580>

    Does anyone think it would be worth a deprecation cycle to rename these http-specific
parameters to have an http prefix? It seems like it would be a major hassle, but also potentially
seems like the right thing to do?
    
    We could, potentially add something to rewrite tasks using the current values to the new
values as well?



docs/configuration-reference.md (line 390)
<https://reviews.apache.org/r/41154/#comment169577>

    s/an/a



docs/configuration-reference.md (line 392)
<https://reviews.apache.org/r/41154/#comment169578>

    s/that tolerated/that will be tolerated



docs/configuration-reference.md (line 393)
<https://reviews.apache.org/r/41154/#comment169579>

    "An alternative to HTTP health checking. Specifies a shell command that will be executed.
Any non-zero exit status will be interpreted as a health check failure."



src/main/python/apache/aurora/client/config.py (lines 98 - 110)
<https://reviews.apache.org/r/41154/#comment169581>

    I feel like this validation, while absolutely important on the client, should also be
verified elsewhere... I don't think the Scheduler does anything other than pass this through
to the executor, so probably the executor should validate it when starting the task? My concern
is someone who does not use the client, but uses the API directly to create tasks would be
able to send invalid config and either have undefined health check behavior, no health check,
or just have a mysterious task failure.
    
    If we performed explicit validation in the executor when creating the health checker we
at least could exit with an explicit reason that makes it clear that the health check config
was invalid.



src/main/python/apache/aurora/executor/common/health_checker.py (line 247)
<https://reviews.apache.org/r/41154/#comment169583>

    This behavior is slightly changed from the current behavior, where if the user did not
opt in to health checks (by not binding a `health` port), we would return `None`.
    
    Now if they did not bind a `health` port, but type is `shell` with no command we'll return
an invalidly configured HealthChecker (It will have a ShellHealthCheck with no cmd set which
will eventually cause the executor to fail when health check is run). This ties in to the
need to validate in the executor that I mentioned above.
    
    Beyond that though, I've found there's been value in being able to configure a health
check, but then, e.g. not have the devel version of your task bind a `health` port, opting
out of health checking for those tasks. I'm not sure if there's a great way to replicate that
behavior for shell health checks (e.g. add an `enabled` field to `HealthCheckConfig`, seems
a bit gross) and if it's a useful enough "feature" to warrant replicating, but I figured I'd
bring it up for discussion regardless.



src/test/python/apache/aurora/common/BUILD (line 25)
<https://reviews.apache.org/r/41154/#comment169584>

    I don't think we need this dependency here. The tests under health_check will be picked
up when we run a recrusive target regardless.



src/test/python/apache/aurora/common/health_check/test_shell.py (line 39)
<https://reviews.apache.org/r/41154/#comment169585>

    Mind splitting these up into individual test cases?


- Joshua Cohen


On Dec. 11, 2015, 12:08 a.m., Dmitriy Shirchenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41154/
> -----------------------------------------------------------
> 
> (Updated Dec. 11, 2015, 12:08 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko, Bill Farner, and Zameer Manji.
> 
> 
> Bugs: AURORA-1551
>     https://issues.apache.org/jira/browse/AURORA-1551
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Adding support for non-HTTP health checks.
> 
> 
> Diffs
> -----
> 
>   3rdparty/python/requirements.txt cfef18ee66b0f92d83c53dacb6d9376fc2e50445 
>   docs/configuration-reference.md 364292998bebb233d300fe59c9ea42b216deee81 
>   src/main/python/apache/aurora/client/config.py 2fc12559016d406c347adb416a5166cca31c961e

>   src/main/python/apache/aurora/common/BUILD 5fce3d0d29d2a38c6563b4d9be963532e595ee19

>   src/main/python/apache/aurora/common/health_check/__init__.py PRE-CREATION 
>   src/main/python/apache/aurora/common/health_check/shell.py PRE-CREATION 
>   src/main/python/apache/aurora/common/http_signaler.py a3193f3259276ec23d37f45839afe3c387cff6b1

>   src/main/python/apache/aurora/config/schema/base.py 398f737bed9ef02ce4a5636896d6587bce26501e

>   src/main/python/apache/aurora/executor/common/health_checker.py 03fdf0afef120c365c6ffad09e152780eed7e351

>   src/main/python/apache/aurora/executor/http_lifecycle.py 6d578cceb56375425ccac1cbfbbcd0add60f20e9

>   src/test/python/apache/aurora/client/BUILD 84c5c845d9b1c8078ae8b47242d0f2ffc00ef6dc

>   src/test/python/apache/aurora/client/test_config.py b1a3c1865819899ef19173be0f861783a2631d0a

>   src/test/python/apache/aurora/common/BUILD 2556c32842b3cf7040cb3c41172a0d9c365cb649

>   src/test/python/apache/aurora/common/health_check/BUILD PRE-CREATION 
>   src/test/python/apache/aurora/common/health_check/__init__.py PRE-CREATION 
>   src/test/python/apache/aurora/common/health_check/test_shell.py PRE-CREATION 
>   src/test/python/apache/aurora/common/test_http_signaler.py f68c71a6765f7f0b93c8c50662515b5742344f35

>   src/test/python/apache/aurora/executor/common/test_health_checker.py 27c71711d52f757ed1552db4accda671a6bdafdd

> 
> Diff: https://reviews.apache.org/r/41154/diff/
> 
> 
> Testing
> -------
> 
> Added unit tests. 
> Ran e2e test.
> Tested expected behavior on virtual Mesos cluster.
> 
> 
> Thanks,
> 
> Dmitriy Shirchenko
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message