httpd-dev mailing list archives

From "Bill Stoddard" <b...@wstoddard.com>
Subject Re: No processes left after big AB test
Date Thu, 12 Apr 2001 14:16:47 GMT
> Greg Ames wrote:
> >
> > "Paul J. Reder" wrote:
> >
> > > By the way, when I start Apache then run ps -efH (with no server load) I get something like
> > > webadmin 21803     1  0 10:07 pts/3    00:00:00   httpd -d /home/webadmin/Apache   (1 top level Apache)
> > > webadmin 21805 21803  0 10:07 pts/3    00:00:00     httpd -d /home/webadmin/Apac (Start_Server number of these)
> > > webadmin 21808 21805  0 10:07 pts/3    00:00:00       httpd -d /home/webadmin/Ap   (1 per Start_server)
> > > webadmin 21809 21808  0 10:07 pts/3    00:00:00         httpd -d /home/webadmin/ (threads_per_Child number of these
> > > webadmin 21812 21808  0 10:07 pts/3    00:00:00         httpd -d n/              -
> > > webadmin 21815 21808  0 10:07 pts/3    00:00:00         httpd -d /               -
> > > webadmin 21818 21808  0 10:07 pts/3    00:00:00         httpd -d                 -
> > >
> > > I understand 21803 and I understand 21809 and its ilk. I also understand either 21805 or 21808
> > > but not both. What am I missing in the way that processes and threads are handled in
> > > APR/threaded mpm?
> > >
> >
> > I hacked up apr/test/testthread.c so that the threads sleep for several
> > seconds before exiting.  This program creates 4 threads via
> > apr_thread_create.  When all 4 are sleeping, I see:
> >
> > [gregames@gandalf httpd-2.0]$ ps ax -HO ppid,wchan | grep testthread
> >  3306  1152 rt_sig S pts/0    00:00:00       ./testthread
> >  3307  3306 do_pol S pts/0    00:00:00         ./testthread
> >  3308  3307 nanosl S pts/0    00:00:00           ./testthread
> >  3309  3307 nanosl S pts/0    00:00:00           ./testthread
> >  3310  3307 nanosl S pts/0    00:00:00           ./testthread
> >  3311  3307 nanosl S pts/0    00:00:00           ./testthread
> >
> > ...so it looks like Linux is creating an extra thread/process for us
> > (3307), probably when we do first pthread_create.
> >
> > Greg
>
> And according to my tests it is 3307 that is exiting / becoming defunct and leaving the
> worker threads orphaned. No core file, no log, no nothing...
>

I've not looked at Linux docs but I would expect that 3306 is the pid, 3307
is the main thread and 3308 - 3311 are the 4 threads created with
apr_thread_create.  Paul, did you make the modification to child_main to
1. Eliminate the call to apr_create_signal_thread()
2. Have the main thread not call worker_main() and only exit when all the
   worker threads have exited?

If so, then perhaps the main thread is still exiting before the worker
threads due to a race condition between when worker_thread_count is
decremented and when the thread actually exits. In other words, a worker may
decrement worker_thread_count, then be suspended, the next worker decrements
worker_thread_count, then suspended, etc. Then the main thread wakes up, sees
that all the workers "have exited" and exits. The problem is that the worker
threads have NOT really exited yet, they've only decremented
worker_thread_count.  Windows has a nice signalling mechanism that the main
thread can use to guarantee that the workers have exited.  I suspect Unix has
something similar.
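
A minimal sketch of what "something similar" could look like with POSIX
threads, assuming pthread_join() is the kind of mechanism meant here. This is
not the threaded MPM code: worker(), finished[] and run_workers_and_join() are
hypothetical names, and worker_thread_count only mimics the counter discussed
above. Each worker decrements the counter and then keeps running for a moment,
which is exactly the window in which a counter-watching main thread would exit
too early. The main thread instead joins each worker, so it cannot return
until every worker has really finished.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define NUM_WORKERS 4

/* Hypothetical stand-in for the MPM's worker_thread_count. */
static int worker_thread_count;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

/* Set by each worker as the very last thing it does. */
static int finished[NUM_WORKERS];

static void *worker(void *arg)
{
    int id = *(int *)arg;
    struct timespec pause = { 0, 100 * 1000 * 1000 };  /* 100 ms */

    pthread_mutex_lock(&count_lock);
    worker_thread_count--;      /* the counter now claims "exited" ... */
    pthread_mutex_unlock(&count_lock);

    nanosleep(&pause, NULL);    /* ... but the thread is still running */
    finished[id] = 1;
    return NULL;
}

/* Starts the workers, joins them all, and returns how many had really
 * finished by the time the joins returned. */
int run_workers_and_join(void)
{
    pthread_t tid[NUM_WORKERS];
    int ids[NUM_WORKERS];
    int i, rc, done = 0;

    worker_thread_count = NUM_WORKERS;
    for (i = 0; i < NUM_WORKERS; i++) {
        ids[i] = i;
        finished[i] = 0;
        rc = pthread_create(&tid[i], NULL, worker, &ids[i]);
        assert(rc == 0);
    }

    /* pthread_join() blocks until the given thread has terminated, so
     * the main thread does not depend on the counter at all. */
    for (i = 0; i < NUM_WORKERS; i++) {
        rc = pthread_join(tid[i], NULL);
        assert(rc == 0);
    }

    for (i = 0; i < NUM_WORKERS; i++)
        done += finished[i];
    return done;
}
```

If child_main waited this way, the counter could stay for bookkeeping, but the
decision to exit would rest on the joins rather than on worker_thread_count
reaching zero.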

Bill


