httpd-dev mailing list archives

From Jeff Trawick <traw...@attglobal.net>
Subject Re: Failure in child_init when doing graceful with flock()
Date Mon, 15 Mar 2004 00:24:02 GMT
Aaron Bannert wrote:
> Ok, after wading through the code for a while I have a working theory:
> 
> 1) Parent creates a child
> 2) Parent gets graceful-restart signal
> 3) Parent returns from ap_run_mpm, pconf is cleared, cross-process lock file
>    is closed and removed.
> 4) Child finally gets scheduled to run the apr_proc_mutex_child_init for
>    fcntl(). Oops, apr_file_open fails since step #3 above removed the file.
>    Child errors out (ENOENT is returned from apr_file_open()) and dies.
> 5) Parent notices that child has died, errors out and dies completely.

sounds very possible
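
A minimal sketch of the failure in step 4, using plain POSIX calls rather than APR and a single process rather than a real parent and child. The helper name and the temp-file path are made up for illustration. It only shows that reopening a lock file by name after it has been removed fails with ENOENT.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: plays the parent's part (create, close, remove the
 * lock file) and then the late child's part (reopen the file by name).
 * Returns 0 if the reopen succeeds, the errno value if it fails, and -1
 * if the temp file could not be created in the first place. */
int reopen_after_unlink(void)
{
    char path[] = "/tmp/lockfile-sketch-XXXXXX";  /* illustrative path */
    int fd = mkstemp(path);       /* parent: create the lock file */
    if (fd < 0)
        return -1;
    close(fd);                    /* parent: cleanup closes the file ... */
    unlink(path);                 /* ... and removes it (step 3) */

    fd = open(path, O_WRONLY);    /* child: reopen by name (step 4) */
    if (fd < 0)
        return errno;             /* expected to be ENOENT */
    close(fd);
    return 0;
}
```

Calling reopen_after_unlink() on a system with a writable /tmp should return ENOENT, which matches the error described in step 4.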

hopefully it is sane for the parent not to exit when a prior-generation child 
reports APEXIT_CHILDFATAL; but it looks like prefork checks for 
APEXIT_CHILDFATAL before checking whether it is a current-generation child

> In any case, can anyone else confirm that this race condition exists, and
> maybe suggest a way to synchronize a parent's shutdown with the starting
> up of an old-generation child? (Eg. the parent shouldn't remove the
> lockfile until all children are successfully started.)

it shouldn't be bad to remove the lockfile at the point where it is removed 
now, and certainly that new child of the old generation should exit ASAP anyway 
since it has the old config; I suspect that if the parent ignores "fatal" exits 
of such children we'd be okay
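
A rough sketch of that ordering, not the actual prefork code: check the child's generation first, and treat a fatal exit code as a reason to shut down only for current-generation children. The struct, the function name, and SKETCH_CHILDFATAL (a stand-in for APEXIT_CHILDFATAL, with an arbitrary value) are all made up for illustration.

```c
#include <assert.h>

/* Stand-in for APEXIT_CHILDFATAL; the value here is arbitrary. */
#define SKETCH_CHILDFATAL 99

enum parent_action { PARENT_CONTINUE, PARENT_SHUTDOWN };

/* Hypothetical record of a reaped child. */
struct child_exit {
    int generation;   /* generation the child was started under */
    int exit_code;    /* exit code reported by the child */
};

/* Decide what the parent does when a child exits. */
enum parent_action on_child_exit(const struct child_exit *c,
                                 int current_generation)
{
    /* Generation check comes first: an old-generation child has the old
     * config and should go away anyway, so its exit code is ignored. */
    if (c->generation != current_generation)
        return PARENT_CONTINUE;

    /* Only a current-generation child can take the parent down. */
    if (c->exit_code == SKETCH_CHILDFATAL)
        return PARENT_SHUTDOWN;

    return PARENT_CONTINUE;
}
```

With this ordering, a child of generation 1 that exits with the fatal code while the parent is at generation 2 leaves the parent running, while the same exit code from a generation 2 child still shuts it down.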

no guesses from me on whether this race condition is what causes the problem

