aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zameer Manji <zma...@apache.org>
Subject Re: Review Request 53590: Change job updates to rely on `health-checks` rather than on `watch_secs`.
Date Wed, 16 Nov 2016 22:44:23 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53590/#review156131
-----------------------------------------------------------




src/main/python/apache/aurora/executor/aurora_executor.py 
<https://reviews.apache.org/r/53590/#comment226276>

    The only objection I have to this code is how we trigger `TASK_RUNNING`. The
    more I stare at this the more confusing I find it.
    
    It really seems like a hack to add `callback` to the status checker interface
    and rely on the fact that only the health checker implementation will call the
    call back.
    
    What if we get another status checker that may control if a task should
    transition to RUNNING or not?
    
    This also seems like an abstraction leak. The executor should be controlling the
    state of the task and not the status checkers.
    
    Have you considered another approach where we instead signal running through the
    `status` method of the status checker?
    
    Perhaps the return value can be changed from `None` for don't care, `RUNNING` to
    signal running right away, `STARTING` to block progress to `RUNNING`, and
    everything else to trigger failure?
    
    Then instead of deleting this code here, we just poll the status checkers to see
    if they all have consensus if we should enter RUNNING.
    
    So if they all return `None` we go to RUNNING. If all or some return RUNNING and
    the rest return `None` we go to RUNNING. If one returns STARTING we poll until
    we reach the previous states.
    
    What do you think about that?


- Zameer Manji


On Nov. 16, 2016, 1:38 p.m., Santhosh Kumar Shanmugham wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53590/
> -----------------------------------------------------------
> 
> (Updated Nov. 16, 2016, 1:38 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
>     https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Make RUNNING a first class state to indicate that the task is running
> and is healthy. It is achieved by introducing a new configuration
> parameter `min_consecutive_successes`, which will dictate when to move
> a task into RUNNING state.
> 
> With this change, it is possible to set the `watch_secs` to 0, so that
> updates are purely based on the task's health, rather than relying on
> watching the task to in RUNNING state for a pre-determined timeout.
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md 392496284b7f046a5dd29a5a3f175305c28a5afe 
>   docs/development/design-documents.md 6bfc679c841ef1b0861758ccc12a5150f1bdb5e3 
>   docs/features/job-updates.md 792f2ae5fd14b1ea9af8be000629ce5a7fc2fe8f 
>   docs/features/services.md b6bfc9de550a714b4243cf033e9f3f03ae676af2 
>   docs/reference/configuration.md f2a0b1873f31e91f3bf0cac6f8448e8130fae688 
>   docs/reference/task-lifecycle.md 4dcb48149e156ef1927e63f5a9fa6b53b174e755 
>   src/main/python/apache/aurora/client/api/updater_util.py c649316edb876565c92cc90c9f030e153c008924

>   src/main/python/apache/aurora/client/config.py 0186af52f0d7d7e3981ec59bf6a01aafee2bcfb1

>   src/main/python/apache/aurora/config/schema/base.py 845163043b0b7b2f9e7aca14677ca9f094658551

>   src/main/python/apache/aurora/executor/aurora_executor.py aee5e56ae66ec7a6acbe97d94d6187fd8646ec9a

>   src/main/python/apache/aurora/executor/common/announcer.py 12d8e0af95d4c24c1b7a3b07d01b59cec4a15894

>   src/main/python/apache/aurora/executor/common/health_checker.py 3c7c09d173c4adbd89eddaee22cffcdd8d268378

>   src/main/python/apache/aurora/executor/common/resource_manager.py b7dc40d8973ec2e5998ab4f6ff988051a70bb1ab

>   src/main/python/apache/aurora/executor/common/status_checker.py 795dae2d6b661fc528d952c2315196d94127961f

>   src/test/python/apache/aurora/client/cli/test_inspect.py 7ef682d010660c9429d94d881dfd42d94f0ce5fd

>   src/test/python/apache/aurora/client/test_config.py 5cf68a5145ddf9478baa30453c0bcb73136fa7eb

>   src/test/python/apache/aurora/executor/common/test_announcer.py 58ca3a8d1ee248f050bab3d10d465c3260c15df3

>   src/test/python/apache/aurora/executor/common/test_health_checker.py da0c56ca084b65427a6122f22f251af8637772d1

>   src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py
fe74bd1d36666ecd89fca1b5b2251202cbbc0f24 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 3f82165333665d127e1ace765a7dada7348a4b91

>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora c9dae28b83d751cfdc7c09dc2d83ac3ee4852720

>   src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora b85ace4bc33bda9483045732024302eb0725c89f

>   src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora b3caa4186217dbf17282a5780791ce3157f4389b

>   src/test/sh/org/apache/aurora/e2e/http_example.py 675ece85e2cf8e45f2c6e079e39b0e095075a3fa

>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 9cc6cec6633ee5973d8c9faa98ae24ab10cbeca4

> 
> Diff: https://reviews.apache.org/r/53590/diff/
> 
> 
> Testing
> -------
> 
> buils-support/jenkins/build.sh
> sh ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Santhosh Kumar Shanmugham
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message