tomcat-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 44454] - busy count reported in mod_jk inflated, causes incorrect balancing
Date Wed, 20 Feb 2008 14:18:26 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=44454>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

------- Additional Comments From rainer.jung@kippdata.de  2008-02-20 06:18 -------
How many threads per process do you use in httpd (MPM configuration)? 60?

Anyway, I adapted my JVM stack analysis script to gstack. The results (sums
over all 19 processes = 1178 threads) follow. In total I can see 68 threads
that would count as busy in the lb (and 2 threads that seem to do direct ajp
work without lb). So how do these numbers compare to your observed busyness at
the time of the dumps?
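
For readers who want to repeat this kind of analysis, here is a minimal sketch
in Python of how gstack output could be grouped into distinct stacks and
counted. It is not the script used for this comment. The function names
(normalize, group_stacks) are made up for illustration, and it assumes each
thread in the dump starts with a line beginning with "Thread " followed by its
"#N ..." frame lines.

```python
import re
from collections import Counter

HEX = re.compile(r"0x[0-9a-fA-F]+")
PROC = re.compile(r"/proc/\d+/")


def normalize(frame):
    """Mask addresses and PIDs so identical call paths compare equal."""
    frame = HEX.sub("0xHEXADDR", frame)
    return PROC.sub("/proc/PID/", frame)


def group_stacks(dump):
    """Count threads per distinct normalized stack in gstack-style text."""
    counts = Counter()
    frames = None  # frames of the thread currently being read
    for line in dump.splitlines():
        line = line.strip()
        if line.startswith("Thread "):
            # A new thread header closes the previous thread, if any.
            if frames:
                counts[tuple(frames)] += 1
            frames = []
        elif line.startswith("#") and frames is not None:
            frames.append(normalize(line))
    if frames:
        counts[tuple(frames)] += 1
    return counts
```

Feeding the concatenated gstack output of all httpd processes to group_stacks
and printing counts.most_common() gives a list of stacks with thread counts,
similar in shape to the details further down.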

Concerning your configuration:

Pessimistic locking for me is more a relic from the early days of the code. I
don't know of any cases where it is actually needed. Had you opened this case
with optimistic locking (the default), I would have asked you to try
pessimistic locking once we ran out of ideas, but your case proves once more
that it's not really useful. I would stick to the default (optimistic) in order
to run the code path that's used most often.

You don't have a reply_timeout. The stacks seem to indicate that hanging replies
might well be your problem. Unfortunately, in 1.2.25 max_reply_timeouts doesn't
work, so any reply_timeout that fires will put your worker into error state
until recovery, and other users will lose their sessions when they are switched
over to failover workers.

In 1.2.26 we fixed max_reply_timeouts, so that you can configure a reply_timeout
(don't wait longer for an answer than X seconds) and the number of
reply_timeouts that you expect before you think the backend itself is in serious
trouble and should be put out of service. So maybe here's the argument for a
switch to 1.2.26, although I hadn't expected that situation last night, when I
wrote about 1.2.26.

In 1.2.27 (March) we will be able to configure a reply_timeout not only per
worker, but also per mapping, so that you can give individual JkMounts
individual reply_timeouts. So if some use cases are known to take a long time,
you can have an appropriately short general timeout, with some longer timeouts
for the URLs known to take notoriously long. It already works in 1.2.27-dev, but
there's no release out yet.

Have a look at the following data yourself. I think the most reasonable approach
is to use a reply_timeout with max_reply_timeouts, which in turn will force you
to update to 1.2.26.

The details of the dumps:

548 Threads idle waiting (I would say in Keepalive,
    so hanging on a connection from the client,
    but waiting for the next request):
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_recv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in apr_bucket_socket_create () from /usr/lib/libaprutil-1.so.0
#5  0xHEXADDR in apr_brigade_split_line () from /usr/lib/libaprutil-1.so.0
#6  0xHEXADDR in ap_core_input_filter () from /proc/PID/exe
#7  0xHEXADDR in ap_get_brigade () from /proc/PID/exe
#8  0xHEXADDR in ap_rgetline_core () from /proc/PID/exe
#9  0xHEXADDR in ap_read_request () from /proc/PID/exe
#10 0xHEXADDR in ap_register_input_filter () from /proc/PID/exe
#11 0xHEXADDR in ap_run_process_connection () from /proc/PID/exe
#12 0xHEXADDR in ap_process_connection () from /proc/PID/exe
#13 0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#14 0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#15 0xHEXADDR in start_thread () from /lib/libpthread.so.0
#16 0xHEXADDR in clone () from /lib/libc.so.6

272 Threads idle
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0xHEXADDR in apr_thread_cond_wait () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in ap_queue_pop () from /proc/PID/exe
#4  0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#5  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#6  0xHEXADDR in start_thread () from /lib/libpthread.so.0
#7  0xHEXADDR in clone () from /lib/libc.so.6

235 Threads lingering close (client connection shutdown)
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_recv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_lingering_close () from /proc/PID/exe
#5  0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#6  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#7  0xHEXADDR in start_thread () from /lib/libpthread.so.0
#8  0xHEXADDR in clone () from /lib/libc.so.6

59 Threads in mod_jk, waiting for reply from the backend (with lb?)
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in read () from /lib/libpthread.so.0
#2  0xHEXADDR in jk_tcp_socket_recvfull () from /etc/httpd/modules/mod_jk.so
#3  0xHEXADDR in ajp_connection_tcp_get_message ()
#4  0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#5  0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so
#6  0xHEXADDR in service () from /etc/httpd/modules/mod_jk.so
#7  0xHEXADDR in jk_handler () from /etc/httpd/modules/mod_jk.so
#8  0xHEXADDR in ap_run_handler () from /proc/PID/exe
#9  0xHEXADDR in ap_invoke_handler () from /proc/PID/exe
#10 0xHEXADDR in ap_process_request () from /proc/PID/exe
#11 0xHEXADDR in ap_register_input_filter () from /proc/PID/exe
#12 0xHEXADDR in ap_run_process_connection () from /proc/PID/exe
#13 0xHEXADDR in ap_process_connection () from /proc/PID/exe
#14 0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#15 0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#16 0xHEXADDR in start_thread () from /lib/libpthread.so.0
#17 0xHEXADDR in clone () from /lib/libc.so.6

36 Threads internal tasks (non JK, non-worker threads)

11 Threads sending non-JK content back to browser
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_bucket_eoc_create () from /proc/PID/exe
#5  0xHEXADDR in ap_core_output_filter () from /proc/PID/exe
#6  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#7  0xHEXADDR in ap_http_chunk_filter () from /proc/PID/exe
#8  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#9  0xHEXADDR in ap_http_outerror_filter () from /proc/PID/exe
#10 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#11 0xHEXADDR in ap_content_length_filter () from /proc/PID/exe
#12 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#13 0xHEXADDR in ?? () from /etc/httpd/modules/mod_deflate.so
#14 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#15 0xHEXADDR in ap_old_write_filter () from /proc/PID/exe
#16 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#17 0xHEXADDR in ap_note_auth_failure () from /proc/PID/exe
#18 0xHEXADDR in ap_process_request () from /proc/PID/exe
#19 0xHEXADDR in ap_register_input_filter () from /proc/PID/exe

6 Threads sending back jk-replies to browser
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_bucket_eoc_create () from /proc/PID/exe
#5  0xHEXADDR in ap_core_output_filter () from /proc/PID/exe
#6  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#7  0xHEXADDR in ap_http_chunk_filter () from /proc/PID/exe
#8  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#9  0xHEXADDR in ap_http_outerror_filter () from /proc/PID/exe
#10 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#11 0xHEXADDR in ap_content_length_filter () from /proc/PID/exe
#12 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#13 0xHEXADDR in ap_filter_flush () from /proc/PID/exe
#14 0xHEXADDR in apr_brigade_write () from /usr/lib/libaprutil-1.so.0
#15 0xHEXADDR in ap_old_write_filter () from /proc/PID/exe
#16 0xHEXADDR in ap_rwrite () from /proc/PID/exe
#17 0xHEXADDR in ws_write () from /etc/httpd/modules/mod_jk.so
#18 0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#19 0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so

2 Threads waiting in mod_jk for response from backend (without lb?)
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in read () from /lib/libpthread.so.0
#2  0xHEXADDR in jk_tcp_socket_recvfull () from /etc/httpd/modules/mod_jk.so
#3  0xHEXADDR in ajp_connection_tcp_get_message ()
#4  0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#5  0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so
#6  0xHEXADDR in jk_handler () from /etc/httpd/modules/mod_jk.so
#7  0xHEXADDR in ap_run_handler () from /proc/PID/exe
#8  0xHEXADDR in ap_invoke_handler () from /proc/PID/exe
#9  0xHEXADDR in ap_process_request () from /proc/PID/exe
#10 0xHEXADDR in ap_register_input_filter () from /proc/PID/exe
#11 0xHEXADDR in ap_run_process_connection () from /proc/PID/exe
#12 0xHEXADDR in ap_process_connection () from /proc/PID/exe
#13 0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#14 0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#15 0xHEXADDR in start_thread () from /lib/libpthread.so.0
#16 0xHEXADDR in clone () from /lib/libc.so.6

2 Threads writing back jk-replies to client (with chunking)
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_bucket_eoc_create () from /proc/PID/exe
#5  0xHEXADDR in ap_core_output_filter () from /proc/PID/exe
#6  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#7  0xHEXADDR in ap_http_outerror_filter () from /proc/PID/exe
#8  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#9  0xHEXADDR in ap_content_length_filter () from /proc/PID/exe
#10 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#11 0xHEXADDR in ap_filter_flush () from /proc/PID/exe
#12 0xHEXADDR in apr_brigade_write () from /usr/lib/libaprutil-1.so.0
#13 0xHEXADDR in ap_old_write_filter () from /proc/PID/exe
#14 0xHEXADDR in ap_rwrite () from /proc/PID/exe
#15 0xHEXADDR in ws_write () from /etc/httpd/modules/mod_jk.so
#16 0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#17 0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so
#18 0xHEXADDR in service () from /etc/httpd/modules/mod_jk.so
#19 0xHEXADDR in jk_handler () from /etc/httpd/modules/mod_jk.so

1 Thread flushing to client in jk reply
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_bucket_eoc_create () from /proc/PID/exe
#5  0xHEXADDR in ap_core_output_filter () from /proc/PID/exe
#6  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#7  0xHEXADDR in ap_http_outerror_filter () from /proc/PID/exe
#8  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#9  0xHEXADDR in ap_content_length_filter () from /proc/PID/exe
#10 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#11 0xHEXADDR in ap_old_write_filter () from /proc/PID/exe
#12 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#13 0xHEXADDR in ap_rflush () from /proc/PID/exe
#14 0xHEXADDR in ws_flush () from /etc/httpd/modules/mod_jk.so
#15 0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#16 0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so
#17 0xHEXADDR in service () from /etc/httpd/modules/mod_jk.so
#18 0xHEXADDR in jk_handler () from /etc/httpd/modules/mod_jk.so
#19 0xHEXADDR in ap_run_handler () from /proc/PID/exe

5 Threads other states, non-jk related
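
As a cross-check, the group counts listed above can be tallied. This is a
minimal sketch in Python: the dictionary keys are shortened labels chosen here
for illustration, and the numbers are the ones from the group headings. Which
groups count as busy in the lb is an assumption, inferred from the fact that
59 + 6 + 2 + 1 gives the 68 mentioned earlier; the 2 threads marked "without
lb?" are kept separate.

```python
# Thread counts per group, copied from the dump summary headings.
counts = {
    "keepalive wait": 548,
    "idle": 272,
    "lingering close": 235,
    "jk waiting for backend (with lb?)": 59,
    "internal tasks": 36,
    "sending non-JK content": 11,
    "sending back jk replies": 6,
    "jk waiting for backend (without lb?)": 2,
    "writing back jk replies": 2,
    "flushing jk reply": 1,
    "other, non-jk": 5,
}

# Assumption: these four groups are the ones counted as busy in the lb.
lb_busy_groups = [
    "jk waiting for backend (with lb?)",
    "sending back jk replies",
    "writing back jk replies",
    "flushing jk reply",
]

lb_busy = sum(counts[name] for name in lb_busy_groups)
direct_ajp = counts["jk waiting for backend (without lb?)"]
total = sum(counts.values())

print(f"busy in lb: {lb_busy}, direct ajp: {direct_ajp}, total: {total}")
```

The listed groups add up to 1177 threads, one fewer than the 1178 quoted at the
top of this comment; the data here does not say where the remaining thread
belongs.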


-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

