httpd-dev mailing list archives

From "Paul J. Reder" <>
Subject Re: Suggested direction for fixing threaded mpm thread startup.
Date Mon, 23 Apr 2001 14:06:57 GMT

wrote:
> > > > It doesn't affect any operator initiated shutdown/restart commands, only
> > > > internal algorithmically generated things like
> > > > perform_idle_server_maintenance and MaxRequestsPerChild.
> > >
> > > Then it doesn't solve the problem that Paul was talking about.
> >
> > AFAIK, the specific problem that Paul had when he started this thread
> > was triggered entirely by MaxRequestsPerChild.
> The specific problem that Paul had when this thread was started, was that
> the server could hang in the case that all processes had a single thread
> serving a long-lived request during a graceful restart.

Actually, it had nothing to do with graceful anything; it happened while I
was pounding the crap out of the threaded server for long periods of time. At
certain points of planetary alignment many of the threads would reach
max_requests_per_child (mrpc) and exit. Servers would often be left
with a very small number of threads handling long responses, and so could not
exit and be replaced. Apache would end up with no threads to handle new requests.
Unfortunately, in this universe, planetary alignment happened frequently ;)
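
A toy model of that stall, as a rough sketch rather than Apache code. The
Child class, its fields, and the idea of a fixed number of process slots
(max_children) are assumptions made for illustration only:

```python
from dataclasses import dataclass

# Toy model of the stall described above; not Apache code.
# All names and numbers here are illustrative assumptions.

@dataclass
class Child:
    threads_total: int      # threads the child process started with
    threads_busy_long: int  # threads still serving long responses
    hit_mrpc: bool          # True once the child reached its request limit

    def accepting_threads(self) -> int:
        # A child that hit its limit takes no new requests; only the
        # threads finishing long responses are still around.
        if self.hit_mrpc:
            return 0
        return self.threads_total - self.threads_busy_long

    def can_exit(self) -> bool:
        # The process can only go away once no thread is mid-response.
        return self.hit_mrpc and self.threads_busy_long == 0


def free_capacity(children, max_children):
    """Return (threads able to take new requests, free process slots)."""
    accepting = sum(c.accepting_threads() for c in children)
    slots_free = max_children - len(children)
    return accepting, slots_free


# Every child hit mrpc at about the same time, and each still has one
# thread tied up in a long response.
children = [Child(25, 1, True) for _ in range(4)]
print(free_capacity(children, max_children=4))
print(any(c.can_exit() for c in children))
```

In this model no thread accepts new requests, no child can exit, and no slot
is free for a replacement, which is the hang in miniature.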

> > > general, I don't understand why this change is necessary.  If we solve the
> > > problem of operator initiated shutdown/restart, then any problem with
> > > perform_idle_server_maintenance and MaxRequestsPerChild should fall out in
> > > the wash.
> >
> > uhhh, no.  Read some of the posts in this thread more carefully, please.
> I have read them all in great detail.  If we solve the problem of the
> operator initiated restart, then MaxRequestsPerChild and
> perform_idle_server_maintenance will just end up working.  This is because
> the problem occurs when a lot of child processes are trying to exit at
> once, but can't because one of the threads is serving a long-lived
> request.  This situation is actually reproducible with a graceful restart.

Gentlemen, put those sharp objects down. You are both right, and wrong.
The fact is that this can happen during graceful restart/shutdown, but timing it
right can be difficult. I can get it to happen easily during abuse testing
of Apache.

Although limiting the number of exiting processes/threads can solve the
problem, I don't think it is the right long term solution. We need to let
them exit when they are configured by the admin to do so. If the admin
wants them to exit after 1000 requests then let all of them that meet
that criterion go. The problem is in creating new ones to replace the
ones that are leaving (when necessary).
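
One way to picture "replace when necessary", as a minimal sketch under
assumptions: count only idle threads in children that are still accepting
work, and start enough new children to cover the shortfall. The function
name and parameters are hypothetical, and the sketch ignores any limit on
the number of processes:

```python
import math

# Hypothetical helper, not the actual Apache maintenance logic.
def replacements_needed(idle_accepting_threads, min_spare_threads,
                        threads_per_child):
    """How many new children to start so idle capacity reaches the minimum."""
    shortfall = min_spare_threads - idle_accepting_threads
    if shortfall <= 0:
        return 0
    # Each new child contributes threads_per_child idle threads.
    return math.ceil(shortfall / threads_per_child)


# Exiting children contribute no idle threads, so they do not hide the
# shortfall from the calculation.
print(replacements_needed(0, 50, 25))
print(replacements_needed(40, 50, 25))
print(replacements_needed(60, 50, 25))
```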

Slowing down the exit process could only work if we allowed the threads
to serve more than mrpc requests, otherwise we are just leaving
threads around that won't be doing anything useful anyway. Besides, as Greg
found out, race conditions are VERY important with this code. The reason
that the processes were being left around (desk check only) is that two
processes tried to go away at the same time, one overwriting the flag of
the other. The kids went away but the process wasn't killed because it
didn't own the flag.
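
The overwrite can be replayed deterministically. This is only a sketch of
the general lost-update pattern: the single shared slot, the pids, and the
function names are all hypothetical and do not reflect the real flag layout:

```python
# Deterministic replay of a lost update on one shared "exiting" flag.
# The flag layout and function names are hypothetical, not Apache's.

shared = {"exiting_pid": None}   # one slot shared by all child processes


def request_exit(pid):
    # Unsynchronized write: a later caller silently replaces an earlier one.
    shared["exiting_pid"] = pid


def reap_if_owner(pid, alive):
    # The process is only killed if it still owns the flag.
    if shared["exiting_pid"] == pid:
        alive.discard(pid)
        shared["exiting_pid"] = None
        return True
    return False


alive = {101, 102}
request_exit(101)
request_exit(102)                        # overwrites 101's claim
reaped_101 = reap_if_owner(101, alive)   # 101 no longer owns the flag
reaped_102 = reap_if_owner(102, alive)
print(reaped_101, reaped_102, sorted(alive))
```

Process 101 is left behind even though it asked to exit first. Giving each
process its own slot, or guarding the write, would avoid this particular
overwrite.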

Just testing with graceful X won't clearly show the problem of not creating
new threads to fill the void of exiting threads, which is the real problem.

I am working to fix the signal stuff in prefork and the threaded mpm. Then I
would like to work (with any volunteers?) on splitting the scoreboard. I think
splitting the scoreboard and redesigning the replacement algorithm is the
right answer.

Paul J. Reder
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein
