aurora-dev mailing list archives

From Hussein Elgridly <huss...@broadinstitute.org>
Subject How dead is a job after aurora job kill?
Date Tue, 14 Apr 2015 22:18:13 GMT
Assuming I tell Aurora to kill a job, either through aurora job kill or the
Thrift API, and it returns with a success:

1. Is it guaranteed that the process on the slave node is no longer running
by the time the command returns?
2. If not, will a subsequent aurora job status return a non-KILLED
status to reflect this? (Even LOST is fine.)
3. Do Thermos finalizers run when a job is killed by a user?

I'm thinking about possible weird failure modes where, e.g., Mesos loses its
connection to the slave and the process keeps on truckin'. The particular case I'm
worrying about is a job continuing to run and surprising us by writing
files when we thought it was dead.

Thanks,
Hussein Elgridly
Senior Software Engineer, DSDE
The Broad Institute of MIT and Harvard
