httpd-dev mailing list archives

From "Naik, Roshan" <>
Subject Bug: Apache hangs if script invokes fork/waitpid
Date Wed, 06 Oct 2004 18:01:23 GMT
The Problem:
I notice that Apache 2 (worker MPM) is not able to correctly handle
a fork/waitpid invoked by a script run under mod_perl.

Here is a simple CGI Perl script that reproduces the problem (run
under mod_perl):

  print "Content-Type: text/plain; charset=euc-jp\n\n";

  my $pid = fork();
  if ( $pid == 0 ) {
      # child: do some work, then exit
      exit(0);
  }
  else {
      # parent: wait for the child to finish
      waitpid( $pid, 0 );
  }
Run this CGI Perl script under mod_perl (with the ExecCGI option).
The call to fork in the Perl script actually creates a child process
that is identical to the Apache process currently handling the
request. This, I believe, is somewhat different from the case where
the CGI script is run under mod_cgi.

The forked child is not exactly identical to its parent, since the
forked process contains only the worker thread and no other threads
(i.e. no listener, no main thread waiting on the POD, and no other
workers).

Now the funny thing is that once the forked process finishes executing
the remainder of the Perl script, it returns to the worker thread
code, which goes back to ap_queue_pop to wait for someone to feed it
more requests to handle... instead of terminating. And there is no one
to feed it. Meanwhile the parent process (whose Perl script invoked
fork()) is blocked in waitpid() for this child, forever.

This effectively makes one worker thread useless. A second such
request sends another worker thread down the drain, and this can
continue until all worker threads are stuck in a never-ending wait. If
Apache has been constrained by MaxClients to some value N, then N such
requests will effectively cause Apache to stop responding to further
requests.

Essentially, the forked Apache process does not know that it is not a
real Apache worker process.

Fixing it:

First solution:
The natural fix is to somehow make the forked worker aware that it is
not a real worker, so that it does not go back to waiting for more
requests once it is done. Essentially, in the worker_thread function
we check whether ap_my_pid == getpid() before continuing into the next
iteration of the while loop around ap_queue_pop(). ap_my_pid holds the
pid of the (real) worker process that forked the child.

  static void *APR_THREAD_FUNC worker_thread(apr_thread_t *thd, void *dummy)
  {
      /* ... snip ... */

      while (!workers_may_exit) {
          /* ... snip ... */

          if (workers_may_exit) {
              break;
          }

          /* Break out of the loop if this worker was forked by
           * another worker: in a forked child, getpid() no longer
           * matches the pid recorded in ap_my_pid. */
          if (ap_my_pid != getpid()) {
              break;   /* or apr_thread_exit(...) */
          }

          rv = ap_queue_pop(worker_queue, &csd, &ptrans);
          /* ... snip ... */
      } /* end while */

      /* ... snip ... */
  }


However, there is a problem with this approach. The connection is
closed (by the forked worker) even before the next iteration of the
while loop. This causes a problem for the parent, who is blocked on
waitpid(). Consequently, when the child returns control to the parent,
the parent can no longer talk to the client, so the parent's
core_output_filter logs a "broken pipe" error message.

It seems that the connection is closed by the worker in the
apr_sendv() function call itself! So once the child has finished
sending all its output, the connection is closed immediately.

I thought it might be a better idea to invoke apr_thread_exit()
instead of break, to prohibit the forked worker from cleaning up any
of the data structures that actually belong to its parent. But
apr_thread_exit() calls this:

  apr_pool_destroy(thd->pool);  /* this cleans up stuff that belongs
                                   to the parent!! */

Second solution:
Invoke exit(0) (if we are in the forked worker) just after
ap_run_handler is invoked by ap_invoke_handler:

  AP_CORE_DECLARE(int) ap_invoke_handler(request_rec *r)
  {
      /* ... snip ... */

      result = ap_run_handler(r);

      if (ap_my_pid != getpid()) {   /* I am a forked worker */
          exit(0);   /* terminate at the earliest possible stage
                      * after the request was processed */
      }

      /* ... snip ... */
  }

Unfortunately this leaves us with a different (though smaller)
problem. The forked child does all its request processing, but is
never given a chance to send any of its data back to the client (if it
has any). The catch-22 is that if we allow it to send all its data
out, then it will close the connection too.

In summary:
The second solution seems preferable of the two I have suggested.

If I had to choose between disallowing the parent or the child from
sending data back, I would disallow the child. Either way, the parent
is allowed to write back only until the child closes the connection.
From my point of view, forking is more useful for performing
time-consuming background tasks than for performing concurrent writes
back to the client, so it seems preferable to disallow the child.

Of course, if there is a way to allow both the child and the parent to
write back, then that's best. We would then have to leave it to the
script writers to decide how the parent and child synchronize between
themselves in order to avoid garbled output going to the client.

