httpd-dev mailing list archives

From "Ralf S. Engelschall" <...@engelschall.com>
Subject Re: Apache 2.0 brokenness...
Date Sat, 22 Jan 2000 18:27:58 GMT

In article <3889D03A.BC318F98@algroup.co.uk> you wrote:

>> > If I run a single instance of the threaded version (by setting
>> > MaxClients to 1) it works. If I run multiple instances and debug the
>> > connection, it works. If I don't, I get connected, but the browser
>> > hangs. I'm not sure how to get a handle on debugging this! Any ideas,
>> > anyone?
>> >
>> > Platform is FreeBSD 3.2.
>> 
>> Does the problem go away if you use -DNO_SERIALIZED_ACCEPT, Ben?
> 
> Hmmm, yes it does.
> 
>> If
>> yes, then it's the mutex deadlock problem I mentioned a few months
>> ago, which occurs with all user space threading environments (e.g.
>> FreeBSD uthreads) because Apache 2.0 still uses flock/fcntl for the
>> inter-process accept mutex, which usually works only in kernel space
>> threading environments (e.g. LinuxThreads). If no, then it's some new
>> problem.
> 
> I must've missed that - would you mind explaining again what the problem
> is?

Here it comes: the problem is that the threaded MPMs were originally
developed under Linux, where a thread is implemented in kernel space.
Apache, for various good reasons, uses a mutex around accept() calls.
This mutex can be implemented in several ways: flock, fcntl,
pthread_mutex, etc. By design, it has to be an inter-process mutex.
For non-threaded MPMs this is no problem, because there it is fine to
call fcntl() or flock(): they block the current process. And in a
threaded MPM with kernel space threads (e.g. LinuxThreads!) it still
works (although it violates POSIX), because there the fcntl()/flock()
calls block only the current thread (which _IS_ actually a process).
But now imagine what happens under any user space threading system
(e.g. FreeBSD uthread or GNU Pth!): there they block not only the
thread but the whole process and all of its threads, when they should
block only a single thread. Bang! A sort of deadlock occurs, which is
broken only by the next incoming HTTP connection.
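
To make this concrete, here is a minimal sketch of an fcntl()-based
accept mutex of the kind described above. It is not Apache's actual
code: the helper names accept_mutex_create(), accept_mutex_lock() and
accept_mutex_unlock() and the lock file path are my own assumptions,
chosen only for illustration. The point to notice is the F_SETLKW call:
it puts whatever the kernel sees as the caller to sleep, which is a
single thread under kernel threads but the whole process under a user
space thread library.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: create an already-unlinked temporary lock file.
 * Children forked afterwards inherit the descriptor. Returns fd or -1. */
int accept_mutex_create(void)
{
    char path[] = "/tmp/accept.lock.XXXXXX";
    int fd = mkstemp(path);
    if (fd >= 0)
        unlink(path);   /* the file lives on until the last fd is closed */
    return fd;
}

/* Hypothetical helper: apply one lock operation to the whole file. */
static int accept_mutex_op(int fd, short type)
{
    struct flock fl = {0};
    int rc;

    assert(fd >= 0);
    fl.l_type = type;         /* F_WRLCK to acquire, F_UNLCK to release */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "to end of file" */

    /* F_SETLKW sleeps in the kernel until the lock is granted. The lock
     * is owned by the process, not by a thread. */
    do {
        rc = fcntl(fd, F_SETLKW, &fl);
    } while (rc < 0 && errno == EINTR);
    return rc;
}

int accept_mutex_lock(int fd)   { return accept_mutex_op(fd, F_WRLCK); }
int accept_mutex_unlock(int fd) { return accept_mutex_op(fd, F_UNLCK); }
```

A child would call accept_mutex_lock() before accept() and
accept_mutex_unlock() right after it returns, so that only one process
at a time sits in accept().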

In short, under a user space threading environment it runs this way:

1. Say we have 2 children: c1, c2.
   Say each child has only one initial thread: t1a, t2a.
2. t1a and t2a enter the accept mutex; as a result,
   c1 and c2 are both blocked until an HTTP connection arrives.
3. Say c1 now gets a connection. The kernel wakes
   c1, and c1 spawns a request thread t1b, which immediately starts
   running. It enters the I/O part and reads from the socket.
4. Now say the socket blocks for a little while, and so
   the user space scheduler in c1 switches from t1b to t1a. t1a
   again enters the accept mutex loop.
5. Now, because the mutex is a process mutex,
   thread t1a blocks the whole child c1 and with it t1b. BINGO!
   The request processing in t1b hangs until child c1 gets another
   request and the user space scheduler switches back to t1b.

Is the problem clearer now, Ben?

The main problem itself is that flock() or fcntl() in a threading
environment is _NEVER_ guaranteed to block just the current thread
(although Apache 2.0 assumes this!). The behaviour of these functions
does not change from the standard POSIX semantics even in an MT
environment. OTOH a standard pthread_mutex cannot be used either,
because it is not an inter-process mutex. The only option is a POSIX
pthread_mutex inside a shared memory segment (that's what POSIX
defines for such situations). The problem is just that most user space
MT implementations do not support this enhanced variant of
pthread_mutex :-(
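
For reference, the shared memory variant looks roughly like the sketch
below. It assumes a platform that supports PTHREAD_PROCESS_SHARED and
anonymous shared mmap(); the names shared_state, shared_state_create()
and run_children() are hypothetical and exist only for this example.
Each forked child bumps a counter under the mutex, so a correct final
count shows that the lock really works across processes.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout: the mutex and the data it guards share one mapping. */
typedef struct {
    pthread_mutex_t lock;
    long counter;
} shared_state;

/* Map shared memory and initialise a process-shared mutex inside it.
 * Returns NULL if the platform refuses any step. */
shared_state *shared_state_create(void)
{
    pthread_mutexattr_t attr;
    shared_state *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return NULL;
    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(s, sizeof *s);
        return NULL;
    }
    /* This attribute is what turns the mutex into an inter-process one. */
    if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutex_init(&s->lock, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        munmap(s, sizeof *s);
        return NULL;
    }
    pthread_mutexattr_destroy(&attr);
    s->counter = 0;
    return s;
}

/* Fork nchildren processes; each increments the counter iters times
 * under the mutex. Returns the final counter, or -1 if fork() fails. */
long run_children(shared_state *s, int nchildren, int iters)
{
    int started = 0;
    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            for (int j = 0; j < iters; j++) {
                pthread_mutex_lock(&s->lock);
                s->counter++;
                pthread_mutex_unlock(&s->lock);
            }
            _exit(0);
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        wait(NULL);                 /* reap every child we started */
    return started == nchildren ? s->counter : -1;
}
```

In a server, the lock/unlock pair would wrap accept() instead of a
counter. Whether this blocks only the calling thread still depends on
the thread library honouring the process-shared attribute.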

                                       Ralf S. Engelschall
                                       rse@engelschall.com
                                       www.engelschall.com
