httpd-dev mailing list archives

From <>
Subject Re: cvs commit: httpd-2.0/server/mpm/threaded threaded.c
Date Tue, 03 Jul 2001 19:19:46 GMT

> > > After about another 10 minutes I started killing copies of b. Eventually I
> > > killed all of the bs and never saw any workers get idle cleaned up.
> >
> > How did you kill off the b's?
> The old-fashioned way - ctrl-c.

I must have missed something here.  I assumed you meant b was a child
process, but it must not be.  If you mean you killed your stress tool,
then this is what I would have expected.  I fixed this bug this morning.
Basically, yesterday's tree didn't clean things up properly.  Today's
tree does.

> > > In addition, the error_log had a number of "long lost child came home!" after
> > > the first SIG_WINCH and after the second SIG_WINCH I found a number of
> > > "child pid XXXXX exit signal Segmentation fault (11)" (some of which resulted
> > > in "long lost child came home!" messages - apparently they came home in a
> > > body bag).
> >
> > The "long lost child came home" messages are an unfortunate side effect,
> > and an easy one to solve.  Basically, because we are replacing one child
> > with another before the parent's wait_for_child finds the one we replaced,
> > we lose that pid from the scoreboard, and we get this message.  The easy
> > solution is to have a single-dimensional array in the parent that keeps
> > track of all child processes created and watches for them to die.  The
> > reason to separate this from the scoreboard is that the parent doesn't
> > need this information in shared memory.  The other solution is to move
> > the pid to the worker_score.  Either one works, but I am not likely to
> > implement either until after we tag and roll.  This is not a fatal error
> > IMHO.
> Do I misunderstand here? There are quiescing workers whose slot has been
> reused? If that is true, doesn't that mean that when the worker finishes
> its task and tries to update its slot with request info that it is
> overwriting the info for the currently active worker in that same slot?
> This seems problematic to me.

You misunderstand.  There are two portions to the scoreboard: parent and
worker scores.  The parent score is re-used before the process dies,
because all of the important information is in the worker score.  The
worker_score is not re-used.  What is causing the "long lost child"
messages is that the parent searches the parent_score for the pid.  Since
that pid has been overwritten, it doesn't exist there anymore.  There is
no danger of overwriting, because the new child process doesn't use a
worker_score that is currently in use.

> > > Also, the first SIG_WINCH doubled the memory footprint of Apache from 3.5 MB
> > > (in the first hour it grew from a little under 2 MB to a about 3.5) to almost
> > > 8 MB. Immediately after the second SIG_WINCH the size grew to 12 MB then
> > > quickly fell back to 8 MB, then continued to grow at a slow rate until I killed
> >
> > This is obviously a memory leak someplace.  I don't know where right now,
> > and I am not going looking for it.  Again, I didn't allocate any memory in
> > my patch, I just changed how we interpret the information we have.
> I thought this was relatively well known. I was just pointing out that, to my
> knowledge, it has never been recommended to run the threaded mpm with MRPC=0
> for this reason. Someday these leaks need to be tracked down, if possible.

The chances are the memory leak will always be there.  Most thread
packages have leaks these days.  Until the thread packages get cleaned up,
there isn't much we can do.


Ryan Bloom               
Covalent Technologies
