httpd-dev mailing list archives

From Paul Richards <p.richa...@elsevier.co.uk>
Subject Re: SIGUSR1
Date Thu, 21 Nov 1996 16:39:20 GMT
Ben Laurie <ben@gonzo.ben.algroup.co.uk> writes:

Did you really need to include my whole message? It was quite long!

> > I suspect that by this point the parent is seriously confused since it
> > thinks it only has "daemons_to_start" children when in fact it has
> > "daemons_to_start * generations" children. I didn't follow this
> > through to the scoreboard so I've no idea what that looks like in
> > these circumstances.
> 
> The parent is not confused. It is just no longer interested in the "old"
> children. They'll die on their own, eventually. It's the "eventually" that is
> the real source of the problem, I think.

I think this is a design error. I'd prefer the parent to keep track of
processes it spawns and not "trust to luck". If it's a quiet site and
quite a few SIGUSR1s are sent, which is possible if some configuration
changes are being made, then you'll end up with an awful lot of
children that may never go away (or at least hang around a long time).

If the number of child processes started is large, you could trash the
system with a single SIGUSR1. I didn't realise this was how graceful
restart was implemented, and I think it's wrong.

The parent should kill off any children that aren't in use when it
receives a SIGUSR1 and only replace the ones it kills.
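
A minimal sketch of that bookkeeping, not the current httpd code. It
assumes the parent keeps a per-child table recording each pid and whether
the child is idle; the names child_slot, select_idle_children and
restart_idle_children are hypothetical and exist only for illustration.
Idle children are signalled and counted for replacement, and busy ones
are only flagged as belonging to the old generation.

```c
#define _POSIX_C_SOURCE 200809L  /* for kill(2) under strict -std=c99 */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical per-child record kept by the parent. */
enum child_state { CHILD_IDLE, CHILD_BUSY, CHILD_DYING };

struct child_slot {
    pid_t pid;
    enum child_state state;
    int old_generation;   /* set when a busy child predates the restart */
};

/*
 * Pick the children to terminate on SIGUSR1. Idle children are marked
 * CHILD_DYING and their pids copied into victims[] (which must have room
 * for n entries). Busy children are left running but flagged as old.
 * Returns the number of pids written to victims[].
 */
size_t select_idle_children(struct child_slot *table, size_t n, pid_t *victims)
{
    size_t count = 0;

    assert(table != NULL && victims != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i].state == CHILD_IDLE) {
            table[i].state = CHILD_DYING;
            victims[count++] = table[i].pid;
        } else if (table[i].state == CHILD_BUSY) {
            table[i].old_generation = 1;
        }
    }
    return count;
}

/*
 * Signal the selected children. The return value is the number of
 * replacements the parent should fork, so the total never grows.
 */
size_t restart_idle_children(struct child_slot *table, size_t n, pid_t *victims)
{
    size_t count = select_idle_children(table, n, victims);

    for (size_t i = 0; i < count; i++)
        (void)kill(victims[i], SIGTERM);  /* errors ignored in this sketch */
    return count;
}
```

The point of the design is that the parent forks exactly as many new
children as it signalled, so repeated SIGUSR1s cannot multiply the
process count.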

The children should ignore SIGUSR1.
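
As a sketch of what that could look like, assuming each child runs a
setup routine right after fork(): the function names below are
hypothetical, and only sigaction(2) and SIG_IGN are standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction(2) under strict -std=c99 */
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set SIGUSR1 to be ignored. Returns 0 on success, -1 on error. */
int child_ignore_sigusr1(void)
{
    struct sigaction sa;

    sa.sa_handler = SIG_IGN;
    sa.sa_flags = 0;
    if (sigemptyset(&sa.sa_mask) != 0)
        return -1;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Hypothetical hook, called once in each child after fork(). */
void child_init_signals(void)
{
    int rc = child_ignore_sigusr1();

    assert(rc == 0);  /* SIGUSR1 is a valid, ignorable signal */
    (void)rc;         /* silence unused warning when NDEBUG is set */
}
```

With this in place, a SIGUSR1 sent to a child by mistake is simply
discarded, and only the parent acts on the signal.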

> > Now, I'm not sure exactly what's going on but the line
> > 
> > 1823     sd=saved_sd;   
> > 
> > is executed during a SIGUSR1 and I'm betting that this bit of code is
> > what is screwing up the parent restart on a subsequent SIGHUP since
> > the error suggests that the parent isn't releasing the socket (it's
> > the only process still running at that point after a HUP).
> 
> Ah. That's a point. Possibly. Certainly it needs to do sd=saved_sd, (though
> possibly not if the port has changed, bug?) but it may need to take some action
> when subsequently SIGHUPed.
> 
> > 
> > All in all, this looks very hacked together and unless someone has a
> > quick fix we should disable it and take a fresh look at the whole
> > forking model for 2.0, which we'd have to do anyway for threading.
> 
> I see no mileage in disabling it - if users don't like it they can not use it.
> It works fine if SIGUSR1 and SIGHUP are not mixed.

I don't think so. You can trash your system by mistakenly sending a
child the signal instead of the parent, and I don't think the behaviour
when the parent is sent SIGUSR1 is entirely safe either, whether for big
sites that spawn a large number of children or for very small sites (like
your development box) that see very few requests but reconfigure the
server a lot.

I don't think the mixing of SIGUSR1 and SIGHUP is actually the main
concern. There's some bug with clearing the parent's socket, but I think
we could probably fix that during the beta cycle. What I'm really
concerned about is the implementation of graceful restarts in general
because of the above behaviour.

-- 
  Paul Richards. Originative Solutions Ltd.  (Netcraft Ltd. contractor)
  Elsevier Science TIS online journal project.
  Email: p.richards@elsevier.co.uk
  Phone: 0370 462071 (Mobile), +44 (0)1865 843155
