perl-dev mailing list archives

From Torsten Foertsch <torsten.foert...@gmx.net>
Subject COND_WAIT in modperl_tipool
Date Wed, 28 Mar 2007 19:06:59 GMT
Hi,

Is there a reason the tipool implementation uses Perl's COND_... and
MUTEX_... macros rather than APR mutexes and condition variables?

I am asking because, after generating heavy load with ab over a GBit link,
I saw request timeouts and, worse, Apache processes hanging around without
accepting connections. I then patched mod_perl/mod_status to report the wait
in COND_WAIT via the scoreboard and saw that some requests were still hanging
in this state after all traffic had finished.

The really bad thing is that the master Apache process still considers the
hanging process alive. So its share of the scoreboard stays blocked, and
MaxClients is effectively lowered by ThreadsPerChild.

I then changed the tipool implementation to use apr_thread_cond_timedwait
with a timeout of 0.5 sec instead of COND_WAIT. The problem seems to be gone.
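
To illustrate the idea of waiting in bounded slices instead of forever, here
is a minimal sketch. It is not the patch described above: it uses plain POSIX
pthread_cond_timedwait rather than apr_thread_cond_timedwait so that it builds
without APR, and pool_t, pool_take, pool_put and the max_slices give-up limit
are hypothetical names invented for this example, not part of mod_perl's
tipool. Note that the POSIX call takes an absolute deadline, whereas the APR
call takes a relative timeout in microseconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical pool of interchangeable items (illustration only). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  available;
    int             idle;       /* number of items currently free */
} pool_t;

static void pool_init(pool_t *p, int idle)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->available, NULL);
    p->idle = idle;
}

/* Absolute CLOCK_REALTIME deadline "now + ms", as the POSIX call expects. */
static struct timespec deadline_after_ms(long ms)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec  += ms / 1000;
    ts.tv_nsec += (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec  += 1;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

/* Take one item. The waiter wakes at least every slice_ms to recheck the
 * predicate, so a lost wakeup costs one slice instead of a hung thread.
 * Giving up after max_slices timeouts is an assumption of this sketch. */
static bool pool_take(pool_t *p, long slice_ms, int max_slices)
{
    bool got = false;
    int timeouts = 0;

    assert(slice_ms > 0 && max_slices > 0);
    pthread_mutex_lock(&p->lock);
    while (p->idle == 0 && timeouts < max_slices) {
        struct timespec ts = deadline_after_ms(slice_ms);
        if (pthread_cond_timedwait(&p->available, &p->lock, &ts) == ETIMEDOUT)
            timeouts++;
        /* On a signal or spurious wakeup, just loop and recheck idle. */
    }
    if (p->idle > 0) {
        p->idle--;
        got = true;
    }
    pthread_mutex_unlock(&p->lock);
    return got;
}

/* Return one item and wake a single waiter. */
static void pool_put(pool_t *p)
{
    pthread_mutex_lock(&p->lock);
    p->idle++;
    pthread_cond_signal(&p->available);
    pthread_mutex_unlock(&p->lock);
}
```

With a 500 ms slice, pool_take(&pool, 500, n) rechecks the pool twice a second
and returns false after n consecutive timeouts, so a caller can log the stall
or fail the request instead of blocking for good.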

I know the problem probably lies not in mod_perl but in the pthread library
on my Linux (NPTL 2.5). But maybe it is safer programming not to wait
forever, all the more so because the order in which pthread_cond_signal
wakes threads is not defined.

Normally, when a worker Apache waits for requests, it has one thread waiting
in ap_mpm_pod_check, one in apr_pollset_poll, and many other threads waiting
somewhere else (see thread 3 below). When a process hangs in the state I have
described, the first 2 threads are missing. I guess it has reached its
MaxRequestsPerChild and is about to exit but cannot finish all its threads.

(gdb) btt 1
[Switching to thread 1 (Thread -1214036288 (LWP 17042))]#0  0xb7f08410 in ?? ()
#0  0xb7f08410 in ?? ()
#1  0xbfc8beb8 in ?? ()
#2  0x00000001 in ?? ()
#3  0xbfc8beb3 in ?? ()
#4  0xb7d2302b in __read_nocancel () from /lib/libpthread.so.0
#5  0x08092d59 in ap_mpm_pod_check (pod=0x8717ad8) at pod.c:54
#6  0x08090748 in child_main (child_num_arg=0) at worker.c:1258
#7  0x080908f4 in make_child (s=0x80b7f48, slot=0) at worker.c:1341
#8  0x08090e08 in perform_idle_server_maintenance () at worker.c:1543
#9  0x08091038 in server_main_loop (remaining_children_to_start=0) at worker.c:1646
#10 0x0809139e in ap_mpm_run (_pconf=0x80b60a8, plog=0x80e4160, s=0x80b7f48)
    at worker.c:1748
#11 0x08062b7e in main (argc=3, argv=0xbfc8c294) at main.c:717
(gdb) btt 2
[Switching to thread 2 (Thread 1388235664 (LWP 17244))]#0  0xb7f08410 in ?? ()
#0  0xb7f08410 in ?? ()
#1  0x52bec318 in ?? ()
#2  0x00000004 in ?? ()
#3  0x08f81ea0 in ?? ()
#4  0xb7ca49f6 in __epoll_wait_nocancel () from /lib/libc.so.6
#5  0xb7d89fc4 in apr_pollset_poll (pollset=0x8f81e68, timeout=-1, num=0x52bec35c,
    descriptors=0x52bec358) at poll/unix/epoll.c:239
#6  0x0808f633 in listener_thread (thd=0x8f289e8, dummy=0x8bf7fa0) at worker.c:687
#7  0xb7d8dbc8 in dummy_worker (opaque=0x8f289e8) at threadproc/unix/thread.c:138
#8  0xb7d1c112 in start_thread () from /lib/libpthread.so.0
#9  0xb7ca42ee in clone () from /lib/libc.so.6
(gdb) btt 3
[Switching to thread 3 (Thread 1396628368 (LWP 17243))]#0  0xb7f08410 in ?? ()
#0  0xb7f08410 in ?? ()
#1  0x533ed318 in ?? ()
#2  0x000000c8 in ?? ()
#3  0x00000000 in ?? ()

Torsten
