httpd-users mailing list archives

From "Murray, Calum" <Calum_Mur...@intuit.com>
Subject [users@httpd] Intermittent "hang" of threads in apache server
Date Fri, 27 Sep 2013 21:49:49 GMT

We are seeing a very intermittent problem where a request hangs in the server for exact multiples
of 30 seconds (we have seen hangs of 30 to 180 seconds). It usually involves 2 threads in the
same pid hanging. The hang is in the server, or maybe between the server and mod_jk, but it is
happening in the server tier. At first we suspected connection pooling/connection reuse in mod_jk,
so we eventually disabled both. Connections are now created and torn down for every request. We
can see the packets coming in at time t and then the request from Apache being forwarded to
the app server tier at t+n*30. We see no issues with sockets establishing connections; indeed,
we see the hang before the initial socket SYN is sent. So we now believe that it is not a
network issue.

This is not a busy server, so the hang is unlikely to be related to load. vmstat and sar both
show nothing going on on the system when the hang occurs. Garbage collection is also not an issue.

With Apache logging at level trace8, the log for the tid with the problem shows a clear delay
between request and response (*** marks removed text):

[Wed Sep 25 10:51:00.188705 2013] [http:trace4] [pid 28203:tid 140532468623104] http_request.c(305):
[client ***:43707]   Connection: keep-alive, referer: ****
[Wed Sep 25 10:51:00.188712 2013] [http:trace4] [pid 28203:tid 140532468623104] http_request.c(305):
[client ***:43707]   If-Modified-Since: Wed, 11 Sep 2013 18:34:32 GMT, ****
[Wed Sep 25 10:51:00.188720 2013] [http:trace4] [pid 28203:tid 140532468623104] http_request.c(305):
[client ***:43707]   If-None-Match: W/\\"2610-1378924472000\\", referer: ****
[Wed Sep 25 10:51:00.188727 2013] [http:trace4] [pid 28203:tid 140532468623104] http_request.c(305):
[client ***:43707]   Cache-Control: max-age=0, referer: ***
[Wed Sep 25 10:51:00.188734 2013] [http:trace4] [pid 28203:tid 140532468623104] http_request.c(305):
[client ***:43707]   X-Forwarded-For: ****
[Wed Sep 25 10:51:00.188767 2013] [rewrite:trace2] [pid 28203:tid 140532468623104] mod_rewrite.c(468):
[client ***:43707] 206.108.42.133 - -**** init rewrite engine with requested uri ***
[Wed Sep 25 10:51:00.188784 2013] [rewrite:trace1] [pid 28203:tid 140532468623104] mod_rewrite.c(468):
[client ***:43707] 206.108.42.133 - - [****][rid#7fd030072280/initial] pass through ***, referer:
***
[Wed Sep 25 10:51:00.188812 2013] [authz_core:debug] [pid 28203:tid 140532468623104] mod_authz_core.c(828):
[client ***:43707] AH01628: authorization result: granted (no directives), referer: ****
[Wed Sep 25 10:51:00.188821 2013] [core:trace3] [pid 28203:tid 140532468623104] request.c(227):
[client ***:43707] request authorized without authentication by access_checker_ex hook: ***
[Wed Sep 25 10:51:00.188841 2013] [charset_lite:trace3] [pid 28203:tid 140532468623104] mod_charset_lite.c(211):
[client***:43707] uri: *** file: *** method: 0 imt: text/css flags:  (null)->(null), referer:
***
[Wed Sep 25 10:51:00.188852 2013] [charset_lite:debug] [pid 28203:tid 140532468623104] mod_charset_lite.c(219):
[client ***:43707] AH01448: incomplete configuration: src unspecified, dst unspecified, referer:
***
[Wed Sep 25 10:52:35.693776 2013] [core:trace6] [pid 28172:tid 140532468623104] core_filters.c(525):
[client ***:10778] core_output_filter: flushing because of FLUSH bucket
[Wed Sep 25 10:53:30.202645 2013] [deflate:trace1] [pid 28203:tid 140532468623104] mod_deflate.c(490):
[client ***:43707] Not compressing very small response of 0 bytes, referer: *<https://connect.intuit.com/view/3e52c5bc-5f0d-46c7-89dc-2799ecd7266e?cta=viewinvoicenow>**
[Wed Sep 25 10:53:30.202686 2013] [http:trace3] [pid 28203:tid 140532468623104] http_filters.c(963):
[client ***:43707] Response sent with status 304, headers:, referer: ***
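
To locate such delays across a whole log rather than by eye, a small script can report silent
periods per pid/tid pair. The following is only a rough sketch: it assumes the Apache 2.4 error
log layout shown in the excerpt above, and the names find_gaps and min_gap are made up for this
illustration.

```python
import re
from datetime import datetime

# Start of an error-log entry as it appears in the excerpt above, e.g.
# [Wed Sep 25 10:51:00.188705 2013] [http:trace4] [pid 28203:tid 140532468623104] ...
# Assumption: the log uses this layout; a custom ErrorLogFormat needs a different pattern.
ENTRY = re.compile(
    r"^\[(?P<ts>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d{6} \d{4})\] "
    r"\[[^\]]*\] \[pid (?P<pid>\d+):tid (?P<tid>\d+)\]"
)


def find_gaps(lines, min_gap=5.0):
    """Yield (pid, tid, start, end, seconds) for each silent period of at
    least min_gap seconds between consecutive entries of one pid/tid pair."""
    last_seen = {}
    for line in lines:
        m = ENTRY.match(line)
        if not m:
            continue  # wrapped continuation line or unrelated text
        ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S.%f %Y")
        key = (m["pid"], m["tid"])
        prev = last_seen.get(key)
        if prev is not None:
            gap = (ts - prev).total_seconds()
            if gap >= min_gap:
                yield key[0], key[1], prev, ts, gap
        last_seen[key] = ts


if __name__ == "__main__":
    import sys

    # Usage: python find_gaps.py < error_log
    for pid, tid, start, end, gap in find_gaps(sys.stdin):
        print(f"pid {pid} tid {tid}: {gap:.3f}s gap ({gap / 30:.2f} x 30s) "
              f"from {start.time()} to {end.time()}")
```

Run against the excerpt above, this reports a single gap of about 150 seconds (5 x 30) for
pid 28203, between the mod_charset_lite entry and the mod_deflate entry. Because tids can repeat
across processes, entries are keyed on the pid/tid pair, so the 10:52:35 entry from pid 28172
does not hide the gap.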


With mod_jk logging at debug, we see evidence of the hang there but no smoking gun. Again, just
a delay of some multiple of 30 seconds.

Obviously the multiples of 30 seconds should be a clue (feels like a timeout?), but we have not
yet been able to tie it to anything.

We are running on VMware with Apache server 2.4.4 (yes, I know, we are going to upgrade to the
latest...).

My question is: has anyone else seen this issue before?


