aurora-reviews mailing list archives

From "Brian Wickman" <wick...@apache.org>
Subject Re: Review Request 25974: Prevent uncaught exceptions from killing the executor.
Date Wed, 24 Sep 2014 01:44:48 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review54378
-----------------------------------------------------------


I'm not sure that "TASK_FAILED: KeyError: 'getpwuid() uid:0 not found'" is the best user
experience. Perhaps log.error the full traceback and report "TASK_FAILED: Internal error" or
something like that.
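
As a rough illustration of the suggestion above, here is a minimal sketch that assumes nothing about the real executor code. The names safe_create, factory and send_update are hypothetical placeholders, and TASK_FAILED is a plain string standing in for the Mesos task state. The full traceback goes to the log, while the user only sees a generic message.

```python
import logging

log = logging.getLogger("executor_sketch")

# Stand-in for the real task state constant; an assumption for illustration.
TASK_FAILED = "TASK_FAILED"


def safe_create(factory, send_update):
    """Call factory(); on failure log the traceback and report a generic error.

    factory and send_update are hypothetical callables, not Aurora APIs.
    Returns the created object, or None if creation raised.
    """
    try:
        return factory()
    except Exception:
        # log.exception logs at ERROR level and appends the full traceback.
        log.exception("Uncaught exception while creating the status manager")
        # The user-facing status hides the internal exception text.
        send_update(TASK_FAILED, "Internal error")
        return None
```

With this shape, a KeyError raised inside factory would still be visible in the executor log, but the task status would read "Internal error".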

I still think this does not solve AURORA-728.  Perhaps add a reconnect loop and if it's not
connected within the initial health interval + (max consecutive failures * health check interval),
send a StatusResult of TASK_FAILED directly from the Announcer.
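
The reconnect loop could look roughly like the sketch below. This is only an assumption-laden outline: connection_deadline, wait_for_connection and try_connect are hypothetical names, not part of the Announcer, and the clock and sleep parameters exist only to make the loop easy to test.

```python
import time


def connection_deadline(initial_interval_secs, max_consecutive_failures, interval_secs):
    """Deadline from the message: initial interval + (max failures * interval)."""
    return initial_interval_secs + max_consecutive_failures * interval_secs


def wait_for_connection(try_connect, deadline_secs, retry_secs=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Retry try_connect() until it returns True or deadline_secs have elapsed.

    try_connect is a hypothetical callable returning True once connected.
    Returns True if connected in time, False on timeout.
    """
    start = clock()
    while True:
        if try_connect():
            return True
        if clock() - start >= deadline_secs:
            return False
        sleep(retry_secs)
```

A caller would send the TASK_FAILED status itself when wait_for_connection returns False, rather than relying on an uncaught exception to surface the problem.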

- Brian Wickman


On Sept. 24, 2014, 1:20 a.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25974/
> -----------------------------------------------------------
> 
> (Updated Sept. 24, 2014, 1:20 a.m.)
> 
> 
> Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
> 
> 
> Bugs: AURORA-728
>     https://issues.apache.org/jira/browse/AURORA-728
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Prevent uncaught exceptions from killing the executor when creating the status manager.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2
> 
> Diff: https://reviews.apache.org/r/25974/diff/
> 
> 
> Testing
> -------
> 
> ./pants src/test/python/apache/aurora/executor:executor-small
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

