httpd-users mailing list archives

From "Thom Park" <Thom.P...@borland.com>
Subject [users@httpd] odd hang using worker mpm
Date Thu, 18 Sep 2003 15:15:16 GMT

Hello,

I have an unfortunate scenario with Apache 2.0.47 compiled
--with-mpm=worker where I end up with a non-responsive server. Only two
processes are left, the master and one child. Taking a pstack of both
processes, I notice that neither of them is in an 'accept wait' state,
which leads me to believe that the mechanism for handling job control
(a listener thread that is 'accepting' connections, plus worker threads
waiting for work) is broken.

I had a look at worker.c and it seemed to me that a listener thread in
each process would attempt to get the process-level accept_mutex to
allow it to accept a connection, and hand this connection off to a
worker thread. It would then release the accept_mutex, thereby allowing
another process (or itself) to become the 'acceptor' of any new work.
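
To make the flow described above concrete, here is a minimal sketch of
the lock / accept / hand-off / unlock cycle. It is not the worker.c
code: it uses plain pthreads instead of APR, runs in a single process,
and simulates accept() with integer "descriptors". All names
(run_handoff_demo, push, QUEUE_CAP, NWORKERS) are hypothetical and
exist only for illustration.

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_CAP 64   /* hypothetical fixed queue size */
#define NWORKERS  3    /* hypothetical worker thread count */

/* Stand-in for the accept mutex (process-level in httpd, thread-level here). */
static pthread_mutex_t accept_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_not_empty = PTHREAD_COND_INITIALIZER;
static int queue[QUEUE_CAP];
static int q_head, q_tail;
static int handled;

/* Hand one "connection" (or a -1 shutdown marker) to the workers. */
static void push(int fd)
{
    pthread_mutex_lock(&queue_lock);
    assert(q_tail < QUEUE_CAP);
    queue[q_tail++] = fd;
    pthread_cond_signal(&queue_not_empty);
    pthread_mutex_unlock(&queue_lock);
}

static void *listener(void *arg)
{
    int nconn = *(int *)arg;
    for (int fd = 0; fd < nconn; fd++) {
        pthread_mutex_lock(&accept_mutex);    /* become the 'acceptor' */
        /* a real listener would block in accept() here */
        push(fd);                             /* hand off to a worker */
        pthread_mutex_unlock(&accept_mutex);  /* let someone else accept */
    }
    for (int i = 0; i < NWORKERS; i++)
        push(-1);                             /* one shutdown marker per worker */
    return NULL;
}

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&queue_lock);
        while (q_head == q_tail)
            pthread_cond_wait(&queue_not_empty, &queue_lock);
        int fd = queue[q_head++];
        if (fd >= 0)
            handled++;                        /* "serve" the connection */
        pthread_mutex_unlock(&queue_lock);
        if (fd < 0)
            return NULL;
    }
}

/* Returns the number of connections served, or -1 if nconn does not fit. */
int run_handoff_demo(int nconn)
{
    pthread_t lt, wt[NWORKERS];
    if (nconn < 0 || nconn + NWORKERS > QUEUE_CAP)
        return -1;
    q_head = q_tail = handled = 0;
    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&wt[i], NULL, worker, NULL);
    pthread_create(&lt, NULL, listener, &nconn);
    pthread_join(lt, NULL);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(wt[i], NULL);
    return handled;
}
```

Calling run_handoff_demo(10) pushes ten simulated connections through
the listener and returns how many the workers picked up. In this toy
version the mutex is always released; the question in this message is
what happens when that release is conditional.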

Based on this I'd expect to see at least one 'listener' thread in any
child process, blocked in an accept state. I was surprised to see that,
in my case, there was none, which at least explains my non-responsive
server.

However, I couldn't understand why the master process wouldn't simply
spawn a new process as the other one was doing nothing.

I then noticed this define in the code:

#define SAFE_ACCEPT(stmt) (ap_listeners->next ? (stmt) : APR_SUCCESS)

The SAFE_ACCEPT macro wraps the accept_mutex lock and unlock operations.

For this mechanism to work correctly, it seems to me that there needs
to be at least two processes (or threads?) capable of 'listening' for
connections. In my scenario, I have only one child, with no threads
listening.

So, here's the bit I don't understand:

Is it possible for a process/thread to crash and not 'unlock' the
accept_mutex? It seems to me that a process could grab the lock and
move into the listening/accept phase while there is more than one thread
capable of accepting (ap_listeners->next != NULL). While it is busy
accepting, another process could die and leave only one possible
'listener' (i.e. ap_listeners->next == NULL). Now, when the acceptor
tries to unlock the mutex, the SAFE_ACCEPT condition is no longer true,
so the lock never gets released, and since the current 'acceptor'
thread is no longer accepting, we have a 'live-lock' situation.
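
The suspected sequence can be reduced to a small single-threaded sketch.
It only illustrates the hypothesis: whether the list behind ap_listeners
can really change between the lock and the unlock in httpd is exactly
the open question. The list type, the toy lock flag and the function
lock_leaked_after_cycle are hypothetical stand-ins, and 0 is used in
place of APR_SUCCESS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the pattern; not the httpd data structures. */
struct listener_rec { struct listener_rec *next; };

static struct listener_rec *listeners;  /* stand-in for ap_listeners */
static int lock_held;                   /* 1 while the toy accept mutex is held */

/* Same shape as the macro in worker.c, with 0 standing in for APR_SUCCESS. */
#define SAFE_ACCEPT(stmt) (listeners->next ? (stmt) : 0)

static int toy_lock(void)   { lock_held = 1; return 0; }
static int toy_unlock(void) { lock_held = 0; return 0; }

/*
 * Runs one lock / accept / unlock cycle.
 * If list_shrinks_mid_accept is non-zero, the list drops to a single
 * entry between the lock and the unlock.
 * Returns 1 if the toy lock is still held afterwards, 0 otherwise.
 */
int lock_leaked_after_cycle(int list_shrinks_mid_accept)
{
    struct listener_rec second = { NULL };
    struct listener_rec first = { &second };

    listeners = &first;
    lock_held = 0;
    assert(listeners != NULL);          /* the macro dereferences the list head */

    (void)SAFE_ACCEPT(toy_lock());      /* two entries: lock is taken */

    if (list_shrinks_mid_accept)
        first.next = NULL;              /* condition flips while "accepting" */

    (void)SAFE_ACCEPT(toy_unlock());    /* skipped if only one entry is left */

    return lock_held;
}
```

With an unchanged list, lock_leaked_after_cycle(0) leaves the lock
released; with the list shrinking mid-cycle, lock_leaked_after_cycle(1)
leaves it held, which is the hang described above.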

Now, I'm pretty sure I've misunderstood this code, as it is pretty
complex, but can anyone out there with a better feel for it comment
on what, in their opinion, could cause this sort of hang scenario?

-Thom

P.S. There is evidence of Apache processes bus-erroring, but I would
have thought the signal handling code would take care of clearing out
dead processes' resources etc.



---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org

