tomcat-users mailing list archives

From Chris Boyce <Chris.Bo...@cdw.com>
Subject RE: Abandoned apache children with mod_jk
Date Sun, 23 Jun 2013 14:53:51 GMT
Thanks again, I will try to investigate the Tomcat side if I see another one. I have tried
compiling the patch for bug 49504. I've yet to see any workers stranded that look like example
1 below. I'll keep an eye on it. Thanks again for your help.

-Chris





-----Original Message-----
From: Rainer Jung [mailto:rainer.jung@kippdata.de] 
Sent: Sunday, June 23, 2013 3:24 AM
To: users@tomcat.apache.org
Subject: Re: Abandoned apache children with mod_jk

On 22.06.2013 23:18, Chris Boyce wrote:
> Thank you so much for the reply.  Here are a couple of examples, as I'm not completely
> sure if my symptoms match, though the pstacks do look very similar to my untrained eye:
> 
> 
> Here is a two day-old child:
> 
> 27743:  /usr/local/apache2/bin/httpd -k start
> -----------------  lwp# 1 / thread# 1  --------------------
>  ff00a42c lwp_wait (3, ffbff804)
>  ff001e88 _thrp_join (3, 0, ffbff86c, 1, ff0b2780, ffbff804) + 38
>  ff214544 apr_thread_join (ffbff8ec, 32eea8, 7, 0, dc328, b15e0) + c
>  0008c43c join_workers (0, fe3aa8, 8bfcc, 32ec30, 0, 1) + ec
>  0008c790 child_main (2, 8b31c, 0, feee2a40, ff0b2840, ff0b2780) + 270
>  0008c970 make_child (c7800, 2, 0, c8800, c7000, c8400) + 128
>  0008d1b4 ap_mpm_run (fe4100f8, e, 0, 1, 27, 1) + 754
>  000343c0 main     (d6218, d8190, ffbffc54, c7800, c7800, 0) + 79c
>  00033754 _start   (0, 0, 0, 0, 0, 0) + 5c
> -----------------  lwp# 3 / thread# 3  --------------------
>  ff0058d4 lwp_park (0, 0, 0)
>  fefff6e8 cond_wait_queue (32ecc8, 32ec98, 0, 0, 0, 0) + 4c
>  fefffd30 cond_wait (32ecc8, 32ec98, 0, 0, fe460a40, 0) + 10
>  fefffd6c pthread_cond_wait (32ecc8, 32ec98, 0, 0, 32ec98, 0) + 8
>  0008e674 ap_queue_pop (32ec78, fe30bf1c, fe30bf18, 4, 0, 32ee40) + 64
>  0008be1c worker_thread (32eea8, 2, fe460a40, c8400, c8400, 0) + 10c
>  ff21440c dummy_worker (32eea8, 0, 0, fe460a40, ff214400, 1) + c
>  ff005850 _lwp_start (0, 0, 0, 0, 0, 0)
> -----------------  lwp# 4 / thread# 4  --------------------
>  ff0058d4 lwp_park (0, 0, 0)
>  fefff6e8 cond_wait_queue (32ecc8, 32ec98, 0, 0, 0, 0) + 4c
>  fefffd30 cond_wait (32ecc8, 32ec98, 0, 0, fe461240, 11692d8) + 10
>  fefffd6c pthread_cond_wait (32ecc8, 32ec98, 0, 0, 32ec98, 0) + 8
>  0008e674 ap_queue_pop (32ec78, fe20bf1c, fe20bf18, 0, 0, 32ee40) + 64
>  0008be1c worker_thread (32eec8, 2, fe461240, c8400, c8400, 4) + 10c
>  ff21440c dummy_worker (32eec8, 0, 0, fe461240, ff214400, 1) + c
>  ff005850 _lwp_start (0, 0, 0, 0, 0, 0)
> 
> ...and several more in lwp_park.

The above one could be related to the cited BZ issue.

> And here's another one that's a day old, but looks different (including lots of jk references):
> 
> 7934:   /usr/local/apache2/bin/httpd -k start
> -----------------  lwp# 1 / thread# 1  --------------------
>  ff00a42c lwp_wait (6, ffbff80c)
>  ff001e88 _thrp_join (6, 0, ffbff874, 1, ff0b2780, ffbff80c) + 38
>  ff214544 apr_thread_join (ffbff8f4, 28e228, 2, 0, 1, b1600) + c
>  0008c43c join_workers (c, 3c5f38, 8bfcc, 28df50, 0, 1) + ec
>  0008c790 child_main (0, 8b31c, 0, feee2a40, ff0b2840, ff0b2780) + 270
>  0008c970 make_child (c7800, 0, 0, c8800, c7000, c8400) + 128
>  0008d1b4 ap_mpm_run (fe4100f8, e, 0, 1, 26, 1) + 754
>  000343c0 main     (d6218, d8190, ffbffc5c, c7800, c7800, 0) + 79c
>  00033754 _start   (0, 0, 0, 0, 0, 0) + 5c
> -----------------  lwp# 6 / thread# 6  --------------------
>  ff00a14c read     (15, fe00a908, 4)
>  fe4a87dc jk_tcp_socket_recvfull (15, fe00a908, 4, 2e4bf8, 510, 4ec) + 74
>  fe4c3088 ajp_connection_tcp_get_message (35f130, 35f168, 2e4bf8, 361188, 2000, 2064) + 44
>  fe4c5588 ajp_get_reply (361168, fe00bb50, 2e4bf8, 35f130, fe00aa70, 2028) + 9c
>  fe4c9304 ajp_service (361168, fe00bb50, 2e4bf8, fe00ab38, 1, c00) + 22b8
>  fe4a1234 jk_handler (23c, 35e740, 3f4390, 1, 13, 3544c8) + 9e4
>  00047534 ap_run_handler (3f40a0, 0, 11, 3e7028, 3f5a08, 0) + 3c
>  000479c0 ap_invoke_handler (3f40a0, 9d000, 3f40a0, 0, fe410028, 0) + c0
>  00073aa4 ap_process_request (3f40a0, 3, 4, 3f40a0, c8420, 21d8d8) + 160
>  00070b34 ap_process_http_connection (3d52e8, 3d5038, 3d5038, 3, c8420, 211980) + 10c
>  0004dce8 ap_run_process_connection (3d52e8, 3d5038, 3d5038, 3, 3d52e0, 3d7068) + 3c
>  0008bf1c worker_thread (28e228, 0, fe462240, c8400, c8400, c) + 20c
>  ff21440c dummy_worker (28e228, 0, 0, fe462240, ff214400, 1) + c
>  ff005850 _lwp_start (0, 0, 0, 0, 0, 0)
> -----------------  lwp# 7 / thread# 7  --------------------
>  ff214400 dummy_worker(), exit value = 0x00000000
>         ** zombie (exited, not detached, not yet joined) **
> -----------------  lwp# 8 / thread# 8  --------------------
>  ff214400 dummy_worker(), exit value = 0x00000000
>         ** zombie (exited, not detached, not yet joined) **
> -----------------  lwp# 9 / thread# 9  --------------------
>  ff214400 dummy_worker(), exit value = 0x00000000
>         ** zombie (exited, not detached, not yet joined) **
> -----------------  lwp# 10 / thread# 10  --------------------
>  ff214400 dummy_worker(), exit value = 0x00000000
>         ** zombie (exited, not detached, not yet joined) **
> -----------------  lwp# 11 / thread# 11  --------------------
>  ff214400 dummy_worker(), exit value = 0x00000000
>         ** zombie (exited, not detached, not yet joined) **
> -----------------  lwp# 12 / thread# 12  --------------------
>  ff214400 dummy_worker(), exit value = 0x00000000
>         ** zombie (exited, not detached, not yet joined) **
> -----------------  lwp# 13 / thread# 13  --------------------
>  ff214400 dummy_worker(), exit value = 0x00000000
>         ** zombie (exited, not detached, not yet joined) **
> -----------------  lwp# 14 / thread# 14  --------------------
>  ff214400 dummy_worker(), exit value = 0x00000000
>         ** zombie (exited, not detached, not yet joined) **

This one seems to wait for an answer from the Tomcat backend.
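
When many children need checking, the distinction between the two examples above could be automated. What follows is only a minimal sketch in Python, assuming pstack output laid out like the traces in this thread. The function name classify_threads and the category labels are hypothetical, and the frame names it matches on (jk_tcp_socket_recvfull, ajp_get_reply, ap_queue_pop) are simply the ones visible in these particular traces.

```python
import re

# Header line that starts each thread section in pstack output,
# e.g. "-----  lwp# 6 / thread# 6  -----".
THREAD_HEADER = re.compile(r"lwp#\s*(\d+)\s*/\s*thread#\s*(\d+)")


def classify_threads(pstack_text):
    """Map each lwp id to 'zombie', 'backend-wait', 'idle', or 'other'.

    Hypothetical helper: the labels are illustrative, not part of pstack.
    """
    sections = {}
    current = None
    for raw in pstack_text.splitlines():
        # Tolerate mail quoting ("> ") in front of each line.
        line = raw.lstrip("> ").rstrip()
        match = THREAD_HEADER.search(line)
        if match:
            current = int(match.group(1))
            sections[current] = []
        elif current is not None:
            sections[current].append(line)

    result = {}
    for lwp, lines in sections.items():
        body = " ".join(lines)
        if "zombie" in body:
            # Thread has exited but has not been joined yet.
            result[lwp] = "zombie"
        elif "jk_tcp_socket_recvfull" in body or "ajp_get_reply" in body:
            # Thread is blocked reading an AJP reply from the backend.
            result[lwp] = "backend-wait"
        elif "ap_queue_pop" in body:
            # Worker is parked waiting for a connection to handle.
            result[lwp] = "idle"
        else:
            result[lwp] = "other"
    return result
```

Applied to the second trace above, this would label lwp 6 as backend-wait and lwp 7 through 14 as zombie, which is the pattern worth following up on the Tomcat side.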

You could use pfiles to check which AJP connections are still open from that process, and
then check the corresponding Tomcat instances with thread dumps (or just look at the current
requests registered in JMX, e.g. via the manager webapp's JMXProxy access) to see which requests
are still being processed in them. Then you need to decide whether you want to add a request
timeout to the mod_jk configuration and also set max_reply_timeouts.

If you can't identify the request from the Tomcat side of things, then you can also attach
to the httpd process with a debugger, switch to thread 6, walk up the stack until you reach
e.g. jk_handler, and print out the request object which was passed to it.

Regards,

Rainer


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

