httpd-dev mailing list archives

From Ryan Bloom
Subject Re: fcntl hanging without anyone holding the lock?
Date Mon, 26 Apr 1999 13:10:57 GMT
> Then I kill -TERM process 1. It goes away, but process 2, thread 1
> never wakes up out of fcntl. The source for the test is attached.
> This sounds like a bug in either the kernel or libc (is it?), but the
> hybrid server is broken here too.

I have just run your code on an AIX system, and it shows the same
behavior you are seeing on Linux.  This doesn't rule out a kernel/libc
bug, but in my opinion it makes one a bit less likely: I would expect
such a bug to have been reported and/or fixed on at least one of the two
systems.

> When a child gets a SIGTERM, it immediately runs the module cleanups,
> destroys the pchild pool, and exits. The problem is that our worker
> threads are still operating during these cleanup steps, so we could
> get all sorts of nasty corruption.
> So, we need to make sure the worker threads aren't doing stuff while
> clean_child_exit is running. I have a few ideas for this:
> 1. pthread_cancel all the threads before doing the other things in
> clean_child_exit. The problem with this is that we might have
> third-party libraries that aren't cancellation-safe. I imagine it's
> even harder to find cancellation-safe libraries than thread-safe or
> signal-safe ones. We could disable and enable cancellation like alarms
> are blocked and unblocked in 1.3, but this is a lot to do for
> something we've acknowledged is evil.
> 2. Wait for the threads to finish their current requests before
> exiting. This is a trivial change, and basically means sending
> SIGWINCH to the children instead of SIGTERM. The disadvantage of this
> is that if we want to kill the server while its processing long-lived
> requests, server shutdown will take a while.
> 3. Same as #2, but have each thread check for the exit_now flag many
> times during request processing. This is just plain nasty. However,
> 1.3 has {block,unblock}_alarms scattered throughout the code, so the
> situation won't be any worse.
> I'm sure there's another solution involving sending signals to all the
> threads and essentially using the 1.3 scheme for managing them, but
> this is nasty too.
> I don't like any of these solutions, but I'm leaning towards #3. Any
> better ideas out there?

Why not just check whether a shutdown has been signaled before grabbing
any of the locks?  This is not a race condition as long as the thread
that receives the signal first sets shutdown_signaled and only then
releases all of its locks.  Any thread that then tries to grab a lock will
see that it shouldn't bother, and should die on its own.  This removes any
need for asynchronous cancellation, and it adds only a few lines of code
and almost no complexity.  It works for any kind of cancellation, because
with the select/accept model no thread can accept more work once ANY kind
of shutdown has been signaled to the child.  This should remove the
problem completely, unless I am missing something.
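
A minimal sketch of the check described above, assuming C11 atomics and a
pthread mutex standing in for the accept lock.  Apart from
shutdown_signaled, every name here (accept_lock, signal_shutdown,
acquire_accept_lock, release_accept_lock, worker_main) is hypothetical and
not taken from the server source.  The second check after the lock is
acquired is an extra assumption, added to cover a thread that was already
waiting on the lock when the flag was set.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins: one flag and one lock per child process. */
static atomic_bool shutdown_signaled = false;
static pthread_mutex_t accept_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called by the thread that receives the signal.  It sets the flag
 * first; releasing any locks it holds would come only after this. */
void signal_shutdown(void)
{
    atomic_store(&shutdown_signaled, true);
}

/* Returns true with accept_lock held, or false if the caller should
 * stop taking work and exit on its own. */
bool acquire_accept_lock(void)
{
    if (atomic_load(&shutdown_signaled))
        return false;                   /* don't bother with the lock */

    int rc = pthread_mutex_lock(&accept_lock);
    assert(rc == 0);
    (void)rc;

    /* Assumption: re-check, since the flag may have been set while
     * this thread was waiting for the lock. */
    if (atomic_load(&shutdown_signaled)) {
        pthread_mutex_unlock(&accept_lock);
        return false;
    }
    return true;
}

void release_accept_lock(void)
{
    pthread_mutex_unlock(&accept_lock);
}

/* Worker loop: accept work only while no shutdown has been signaled. */
void *worker_main(void *arg)
{
    (void)arg;
    while (acquire_accept_lock()) {
        /* select()/accept() on the listening sockets would go here */
        release_accept_lock();
        /* request processing would go here */
    }
    return NULL;                        /* the thread dies on its own */
}
```

With this shape, no worker needs to be cancelled: once signal_shutdown()
has run, each worker falls out of its loop the next time it reaches the
lock, and the child can join its threads before running cleanups.  A
thread blocked in a long request still has to finish that request first.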


Ryan Bloom
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.	
