aurora-reviews mailing list archives

From "Moses Nakamura" <nny...@gmail.com>
Subject Re: Review Request 31104: task-executor: TASK_RUNNING after first health check
Date Wed, 18 Feb 2015 04:36:56 GMT


> On Feb. 18, 2015, 1:40 a.m., Zameer Manji wrote:
> > Will this change break update configs because some time values in the UpdateConfig
are timeouts until a task enters the RUNNING state?
> 
> Bill Farner wrote:
>     It very likely will, communication will be imperative with this change.  Moses -
can you link this against the ticket and/or discussion thread with background?

If I understand your question correctly, yes, that is the intent.  If you have health checks,
passing a health check will short-circuit the "OK, we can move on to the next batch" step.
If you don't have health checks, it should continue to operate as it does today, which
is to say that the task will be marked successful unless it gets marked as failed by the executor
(I think for non-health-checking services this means the process died).
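
To make the intended behavior concrete, here is a rough sketch of the decision described above.
It is not code from the review: the function name, its parameters, and the outcome values are
all hypothetical, and the "watch window" flag is only my stand-in for the existing
UpdateConfig timeouts.  The case of a health-checked task that never passes a check is not
settled in this thread, so the sketch just leaves it pending.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical per-instance outcomes during a batched update."""
    PENDING = "PENDING"      # keep waiting before moving to the next batch
    SUCCEEDED = "SUCCEEDED"  # OK to move on to the next batch
    FAILED = "FAILED"        # the executor marked the task as failed


def instance_outcome(health_checks_enabled, passed_health_check,
                     marked_failed, watch_window_elapsed):
    """Illustrative only: mirrors the prose above, not the scheduler's real logic."""
    if marked_failed:
        # The executor reported a failure (e.g. the process died).
        return Outcome.FAILED
    if health_checks_enabled:
        if passed_health_check:
            # Short-circuit: a passing health check is enough to move on.
            return Outcome.SUCCEEDED
        # Assumption: not specified in this thread, so keep waiting.
        return Outcome.PENDING
    # No health checks: unchanged behavior, successful unless marked failed.
    return Outcome.SUCCEEDED if watch_window_elapsed else Outcome.PENDING
```

For example, instance_outcome(True, True, False, False) yields Outcome.SUCCEEDED without
waiting out the window, while instance_outcome(False, False, False, False) stays
Outcome.PENDING until the window elapses.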


- Moses


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31104/#review72868
-----------------------------------------------------------


On Feb. 18, 2015, 4:32 a.m., Moses Nakamura wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31104/
> -----------------------------------------------------------
> 
> (Updated Feb. 18, 2015, 4:32 a.m.)
> 
> 
> Review request for Aurora and Brian Wickman.
> 
> 
> Bugs: AURORA-894
>     https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This is the first step in changing TASK_RUNNING to mean that the application is alive
and responding to health checks (if the task is configured to support health checks).  This
review is just to get feedback; I can't do this review in parts because the scheduler must
be changed in lockstep with the executor, or everything will break.
> 
> I don't know if this is the right approach.  Could you give me some high-level advice?
 I'm also not sure who to add to this review.
> 
> Here is the high level description that we came up with:
> 
> http://mail-archives.apache.org/mod_mbox/incubator-aurora-dev/201412.mbox/%3CCAOTkfX4KTUpMVcjeFf5%3DvvGXb91to5baNSzvyiwtk-sTddxGXQ%40mail.gmail.com%3E
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 9c0282392dbb9cca308baf47adc1750c1f5cacc6
>   src/main/python/apache/aurora/executor/common/announcer.py dda76f018f472d7d8228459eb89f4c5daf9df26d
>   src/main/python/apache/aurora/executor/common/health_checker.py 60676ba0fbd8a218fe4309f07de28e2c66d54530
>   src/main/python/apache/aurora/executor/common/resource_manager.py 08e02e41b581f275f070228bb23c4cf2a0489f9a
>   src/main/python/apache/aurora/executor/common/status_checker.py 624921d68199df098ea51ee8a10815403bf58984
>   src/test/python/apache/aurora/executor/common/test_announcer.py 6b782778e52394de3744b43003226dac3f65169e
>   src/test/python/apache/aurora/executor/common/test_health_checker.py def249c2509a28f7145380f250f79202b653dc83
>   src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py 8f288f6115ab52265dfaffffda3f41d81271c55a
> 
> Diff: https://reviews.apache.org/r/31104/diff/
> 
> 
> Testing
> -------
> 
> This hangs after I call is_health_checks_enabled, and I don't know why.  My suspicion
is that I'm throwing an exception and cratering the task executor, but I don't know how to
tell.  How do I get it to print?  I'm running it with:
> 
> ./pants test src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Moses Nakamura
> 
>

