tomcat-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From André Warnier ...@ice-sa.com>
Subject Re: mod_jk, not sure if an issue
Date Sat, 25 Apr 2009 11:58:31 GMT
Hi Rainer.

Thanks very much for the time you took to analyse the case below.
However, I am still puzzled :

This is indeed a system where there appear to be problems at the level 
of the connection between the client workstations/browser, and the 
distant server.  Those are the "client write" errors.
We are in the process (right, with the network people), of 
analysing/discussing that (automated tests, Wireshark etc..).

However, it is also a single host with both Apache and mod_jk and Tomcat 
on the same host.  So what I am troubled about is the "cping/cpong" and 
"sendfull" kind of messages. Are they not strictly between mod_jk and 
Tomcat ?  And if yes, considering these are on the same host, do these 
messages also indicate that something else is wrong ?

André


Rainer Jung wrote:
> Hi André,
> 
> On 22.04.2009 18:29, André Warnier wrote:
>> As (maybe) part of another issue which I am still trying to track down
>> with the concerned network people (client write errors, Sample 2 below),
> 
> Before commenting Sample 1: Although we generally respond to the "client
> errors" topic with "Users press reload or click on another link before
> answer returns - check whether your responses take to long", we now
> observed two cases, where there were real network problems (I generally
> avoid to bash the network people, but it is still possible that the
> trouble comes from the network):
> 
> - once a bad cable
> - once a bad driver for a network card used by a VMware guest system
> (problems with TCP offloading). TCP offloading problems seem to become a
> new trend :(
> 
>> I find the following kind of messages regularly in the mod_jk logfile
>> (Sample 1).
>> I just want to know if this indicates a problem somewhere, or if these
>> are normal occurrences.
>> These [info] messages do not always come as often as shown below, but
>> seem to come with some regularity nevertheless.
>>
>> Basically, my question is whether this indicates that there might be a
>> discrepancy between front-end and back(-)end or something of the kind,
>> or if there is some other parameter to adjust to make these go away.
>>
>> Apache 2.x prefork MPM
>> mod_jk 1.2.27
>> Tomcat 5.5.x
>> under RHEL, single host
>>
>> JkLogLevel  info
>> and
>> worker.ajp13.port=8009
>> worker.ajp13.host=localhost
>> worker.ajp13.type=ajp13
>> # new options since 1.2.27 :
>> worker.ajp13.ping_mode=A
>> (single worker, no load balancing)
>>
>> back-end Connector :
>> <Connector port="8009"
>>  enableLookups="false" redirectPort="8443" protocol="AJP/1.3"
>>  maxThreads="150" minSpareThreads="25" maxSpareThreads="100"
>> backlog="100" />
> 
> The configuration contains no means for shutting down idle workers.
> Neither mod_jk (connection_pool_timeout) nor Tomcat (connectTimeout)
> will close idle connections. So it's possible, that an intermediate
> system, e.g. firewall drops connections.
> 
>> Sample 1 :
>> --------
>>
>> [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.007701
>> 4897
>> [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.004344
>> 1457
>> [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.101926
>> 1945
>> [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.322939
>> 296975
>> [Tue Apr 21 13:35:14 2009] POST /myapp/normen/servlet.myapp 200 0.032625
>> 1576
>> [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.003084
>> 4176
> 
> It would be nicer to put the JK access logging into the access log and
> add the pid (process id) there, so you could identify exaclty which
> access log line relates to which JK info/error message block. not
> critical though.
> 
>> [Tue Apr 21 13:35:14 2009] [18753:2537034048] [info]
>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>> socket 16 returned -32 (errno=32)
>> [Tue Apr 21 13:35:14 2009] [18753:2537034048] [info]
>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>> [Tue Apr 21 13:35:14 2009] [18753:2537034048] [info]
>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>> socket -1 keepalive cping/cpong failure (errno=32)
> 
> As you already noticed, errno=32 is a broken pipe error. Sending the
> cping packet uses the write() system call. The linux man pages for
> write(2) on my systems say:
> 
> EPIPE  fd is connected to a pipe or socket whose reading end is closed.
>  When this happens the writing process will also receive a SIGPIPE
> signal.  (Thus, the write return value is seen only if the program
> catches, blocks or ignores this signal.)
> 
> So we should have "reading end is closed", which is Tomcat, so the
> connection should bw in CLOSE_WAIT from the point of view of mod_jk. I
> would expect a RST send by a firewall or similar between mod_jk and
> Tomcat to also result in EPIPE.
> 
>> [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.010538
>> 1366
>> [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.002768
>> 2507
>> [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.002658
>> 1999
>> [Tue Apr 21 13:35:17 2009] POST /myapp/normen/servlet.myapp 200 0.060688
>> 21560
>> [Tue Apr 21 13:35:19 2009] POST /myapp/normen/servlet.myapp 200 0.017692
>> 23635
>> [Tue Apr 21 13:35:22 2009] POST /myapp/normen/servlet.myapp 200 1.032645
>> 21400
>> [Tue Apr 21 13:35:34 2009] [30410:2537034048] [info]
>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>> socket 16 returned -32 (errno=32)
>> [Tue Apr 21 13:35:34 2009] [30410:2537034048] [info]
>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>> [Tue Apr 21 13:35:34 2009] [30410:2537034048] [info]
>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>> socket -1 keepalive cping/cpong failure (errno=32)
> 
> Same again.
> 
>> [Tue Apr 21 13:35:34 2009] POST /myapp/normen/servlet.myapp 200 0.071825
>> 20753
>> [Tue Apr 21 13:35:36 2009] POST /myapp/normen/servlet.myapp 200 0.115782
>> 14137
>> [Tue Apr 21 13:35:56 2009] [30413:2537034048] [info]
>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>> socket 1 returned -32 (errno=32)
>> [Tue Apr 21 13:35:56 2009] [30413:2537034048] [info]
>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>> [Tue Apr 21 13:35:56 2009] [30413:2537034048] [info]
>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>> socket -1 keepalive cping/cpong failure (errno=32)
> 
> Again
> 
>> [Tue Apr 21 13:35:56 2009] POST /myapp/normen/servlet.myapp 200 0.040233
>> 20753
>> [Tue Apr 21 13:35:57 2009] [29827:2537034048] [info]
>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>> socket 16 returned -32 (errno=32)
>> [Tue Apr 21 13:35:57 2009] [29827:2537034048] [info]
>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>> [Tue Apr 21 13:35:57 2009] [29827:2537034048] [info]
>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>> socket -1 keepalive cping/cpong failure (errno=32)
> 
> Again
> 
>> [Tue Apr 21 13:35:57 2009] POST /myapp/normen/servlet.myapp 200 0.036396
>> 12264
>> [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.292587
>> 1409
>> [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.010833
>> 6729
>> [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.015750
>> 3484
>> [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.032983
>> 2915
> 
> Note that all 4 cases were logged by a diffrent PID (process id), and in
> the prefork case, each process has its own pool of size 1. So 4
> different processes experienced a connection closed by the backend or
> something in between. If you would log the JK access data and the PID in
> the apache access log, you could verify, when those processes used the
> backend connection the last time previously to verify the idleness theory.
> 
>> Sample 2 : (for illustration of the other issue only)
>> --------
>> [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.016737
>> 6729
>> [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.019133
>> 3484
>> [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.010458
>> 2915
>> [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.112756
>> 18897
>> [Tue Apr 21 13:15:44 2009] [29831:2537034048] [info]
>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>> socket 1 returned -32 (errno=32)
>> [Tue Apr 21 13:15:44 2009] [29831:2537034048] [info]
>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>> [Tue Apr 21 13:15:44 2009] [29831:2537034048] [info]
>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>> socket -1 keepalive cping/cpong failure (errno=32)
> 
> The same for pid 29831
> 
>> [Tue Apr 21 13:15:44 2009] [30401:2537034048] [info]
>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>> socket 17 returned -32 (errno=32)
>> [Tue Apr 21 13:15:44 2009] [30401:2537034048] [info]
>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>> [Tue Apr 21 13:15:44 2009] [30401:2537034048] [info]
>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>> socket -1 keepalive cping/cpong failure (errno=32)
> 
> pid 30401
> 
>> [Tue Apr 21 13:15:49 2009] [29827:2537034048] [info]
>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>> socket 16 returned -32 (errno=32)
>> [Tue Apr 21 13:15:49 2009] [29827:2537034048] [info]
>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>> [Tue Apr 21 13:15:49 2009] [29827:2537034048] [info]
>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>> socket -1 keepalive cping/cpong failure (errno=32)
> 
> pid 29827
> 
>> [Tue Apr 21 13:15:49 2009] [29831:2537034048] [info]
>> ajp_process_callback::jk_ajp_common.c (1748): Writing to client aborted
>> or client network problems
>> [Tue Apr 21 13:15:49 2009] [29831:2537034048] [info]
>> ajp_service::jk_ajp_common.c (2407): (ajp13) sending request to tomcat
>> failed (unrecoverable), because of client write error (attempt=1)
> 
> Now a client write error, again pid 29831. Possibly, the cping write
> error above too a long time to be detected and that's why the response
> retrieved via a fresh connection wasn't consumed by the client? Although
> the below response times seem to tell us, that the longest request in
> the snippet here took 9 seconds, which is not that long.
> 
> There is no mod_jk internal relation between the cping send failure and
> a client write error. It could only be in some indirect way, like both
> being caused by a general network problem (occuring on the path to the
> browser and on the path to the backend - but then why don't we see lots
> of other communication errors as well?) or by some timing issue
> triggering user behaviour - not very likely in light of the blow
> duration lines.
> 
>> [Tue Apr 21 13:15:49 2009] POST /myapp/normen/servlet.myapp 200 5.210358
>> 8184
>> [Tue Apr 21 13:15:49 2009] [29831:2537034048] [info]
>> jk_handler::mod_jk.c (2469): Aborting connection for worker=ajp13
> 
> This is a consequence of the client write error, so next time this
> process will open a fresh connection to the backend.
> 
>> [Tue Apr 21 13:15:49 2009] POST /myapp/normen/servlet.myapp 200 0.363977
>> 20753
>> [Tue Apr 21 13:15:54 2009] POST /myapp/normen/servlet.myapp 200 9.057331
>> 20755
>> [Tue Apr 21 13:15:55 2009] POST /myapp/normen/servlet.myapp 200 0.215937
>> 14166
> 
> Regards,
> 
> Rainer
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Mime
View raw message