httpd-dev mailing list archives

From Dean Gaudet <dgau...@arctic.org>
Subject Re: fcntl() errors on Solaris
Date Wed, 13 Aug 1997 07:54:49 GMT
Has anyone looked at my jumbo patch that eliminates most of these problems
by moving away from file locking?

Dean

On Tue, 12 Aug 1997, Alexei Kosut wrote:

> I've been seeing this pretty consistently over the past few weeks: Since
> I use Apache mostly for development (of Apache and related bits), I
> start and stop Apache a lot; dozens of times an hour. After five or six
> hours, I always manage to get it so Apache no longer works right; it only
> forks one or two children, instead of the five I have set, and I get tons
> of these in my error log:
> 
> [Tue Aug 12 17:41:29 1997] fcntl: F_SETLKW: No record locks available
> [Tue Aug 12 17:41:29 1997] - Error getting accept lock. Exiting!
> 
> I'm using Solaris 2.5.1, and I'm presuming this is because Apache uses
> fcntl for its accept_mutex to lock a file, and then exits (see my
> earlier message about how sig_term() works, and why it sucks) without
> unlocking the file. All those locked files add up, I guess.
> 
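A minimal sketch of the locking pattern described above, not Apache's
actual code: take a whole-file fcntl() lock with F_SETLKW, then release it
explicitly with F_UNLCK before closing the file and exiting. The names
accept_lock_op and demo_accept_mutex, and the lock file path passed in,
are hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: apply a whole-file record lock operation.
 * type is F_WRLCK to acquire the lock or F_UNLCK to release it. */
static int accept_lock_op(int fd, short type)
{
    struct flock fl;
    int rc;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */

    /* F_SETLKW blocks until the request can be granted; retry if a
     * signal interrupts the wait. */
    while ((rc = fcntl(fd, F_SETLKW, &fl)) < 0 && errno == EINTR)
        continue;
    return rc;
}

/* Hypothetical demo: lock, do the protected work, unlock, clean up.
 * Returns 0 on success, -1 on any failure. */
int demo_accept_mutex(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0600);

    if (fd < 0) {
        perror("open");
        return -1;
    }
    if (accept_lock_op(fd, F_WRLCK) < 0) {
        perror("fcntl: F_SETLKW");
        close(fd);
        return -1;
    }

    /* ... the accept() call would be serialized here ... */

    /* Release the lock explicitly instead of relying on process exit. */
    if (accept_lock_op(fd, F_UNLCK) < 0) {
        perror("fcntl: F_UNLCK");
        close(fd);
        return -1;
    }
    close(fd);
    unlink(path);
    return 0;
}
```

Whether the explicit unlock is enough to avoid the "No record locks
available" error over NFS is an assumption here; the sketch only shows
the lock being released before the process goes away.
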
> If I leave it overnight, and come back, it works again for a while.
> 
> Oh, and the files are being locked over NFS. That possibly has something
> to do with it.
> 
> Still, Apache should make sure it unlocks its accept mutex before it
> quits. I think the children do the right thing wrt mutexes when they are
> restarted, though I'm not sure. I'm sure, however, that a shutdown
> (SIGTERM) does the wrong thing.
> 
> In fact, when the Apache parent gets a SIGTERM, it should do the following
> (IMHO) or something similar (instead of just killpg(SIGKILL) and exit):
> 
> 1. Set a shutdown_pending, like SIGHUP sets a restart_pending. By not
>    exiting directly, you let alarms and things work correctly.
> 2. standalone_main then checks shutdown_pending at the same time
>    it checks restart_pending.
> 3. Act similarly to a non-graceful restart: do a killpg(SIGTERM) (this
>    will shut down the children correctly, allowing child_exit to be
>    called and pool cleanups to be done).
> 4. call destroy_pool(), so any cleanups registered for the main server
>    are done (this might include stuff besides freeing memory -
>    disconnecting from a database or shutting down a companion
>    process. Whatever.)
> 5. Maybe wait a few seconds, and do a killpg(SIGKILL), just to make
>    sure.
> 6. Now exit.
> 
> Yes, it takes a bit longer, but I think this is important to making sure
> Apache does the right thing. And as I've said, I will veto any release
> that includes a child_exit API phase that doesn't work (which includes
> the current 1.3a2-dev).
> 
> If the above sequence of events sounds good, I can make a patch. Or
> someone else can.
> 
> -- Alexei Kosut <akosut@organic.com>
> 
> 

