tomcat-users mailing list archives

From André Warnier
Subject Re: AJP communication failures
Date Wed, 30 May 2012 09:22:48 GMT

Thank you for all the very detailed information provided.

From what I can see in the logs, my impression at this point is that this is a problem
buried fairly deep in the TCP/IP stack, and that both Apache+mod_proxy_ajp and Tomcat may
just be suffering the consequences of an underlying TCP/IP issue (or of a Windows NLB
"feature").

In the logs, you have messages like

Software caused connection abort: socket write error

which come from the JVM running Tomcat (and probably even from native code in the JVM).

Similarly, messages in Apache httpd's logs like

[Tue May 29 15:29:43 2012] [error] (OS 10060)A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.  : ajp_ilink_receive() can't receive
[Tue May 29 15:29:43 2012] [error] ajp_read_header: ajp_ilink_receive failed
[Tue May 29 15:29:43 2012] [error] (70007)The timeout specified has expired: proxy: dialog to ( failed

look to me like OS-level error conditions, just forwarded by Apache to the logs (at least
the (OS 10060) prefix looks like a Windows error code).
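
To illustrate how the numeric codes can be read, here is a minimal sketch in Python. The table is a small hand-copied subset of the Winsock error codes, and `describe_winsock_error` is a hypothetical helper name of mine, not something provided by httpd or Tomcat. The 10053 entry appears to match the text of the message seen on the Tomcat side.

```python
# Small hand-copied subset of the Winsock error codes (not exhaustive).
WINSOCK_ERRORS = {
    10053: ("WSAECONNABORTED", "Software caused connection abort"),
    10054: ("WSAECONNRESET", "Connection reset by peer"),
    10060: ("WSAETIMEDOUT", "Connection timed out"),
    10061: ("WSAECONNREFUSED", "Connection refused"),
}


def describe_winsock_error(code):
    """Return a one-line description for a Winsock error code (hypothetical helper)."""
    name, text = WINSOCK_ERRORS.get(code, ("UNKNOWN", "not in this table"))
    return f"{code} {name}: {text}"


if __name__ == "__main__":
    # The code seen in the httpd error log above.
    print(describe_winsock_error(10060))
```

Both 10053 and 10060 are raised by the socket layer of the OS, which would fit with httpd and Tomcat merely reporting what they were told.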

I've read a bit about Windows NLB (just now, to find out what it is), and it seems to me
that there is at least /a possibility/ that combining it with another kind of
load-balancing (as you do with mod_proxy_ajp) may not be the most stable configuration.
From the logs, it really looks as if both Apache and Tomcat occasionally find themselves
with a suddenly non-existent connection, where ping packets are not being returned, and/or
a read or write on a socket suddenly becomes unresponsive.
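
One way to check this independently of httpd would be to send the AJP13 "CPing" packet to the Tomcat connector yourself and see whether the "CPong" comes back. The following is only a sketch under assumptions: `ajp_cping` is a hypothetical helper of mine, and the host and port in the usage line (127.0.0.1 and 8009, the usual default for the AJP connector) are assumptions, not values taken from your configuration.

```python
import socket

# AJP13 CPing as sent by the web server, and the CPong reply from the container.
CPING = bytes([0x12, 0x34, 0x00, 0x01, 0x0A])
CPONG = bytes([0x41, 0x42, 0x00, 0x01, 0x09])


def ajp_cping(host, port, timeout=5.0):
    """Send one CPing and return 'ok' or a short description of the failure."""
    try:
        # The timeout applies to the connect and to the later send/recv calls.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(CPING)
            reply = b""
            while len(reply) < len(CPONG):
                chunk = sock.recv(len(CPONG) - len(reply))
                if not chunk:
                    return "closed by peer before CPong"
                reply += chunk
            if reply == CPONG:
                return "ok"
            return f"unexpected reply {reply.hex()}"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"socket error: {exc}"


if __name__ == "__main__":
    # Assumed address and port; adjust to the actual AJP connector.
    print(ajp_cping("127.0.0.1", 8009))
```

Run in a loop from the httpd host while the problem occurs, something like this could show whether the connection fails even when neither mod_proxy_ajp nor the webapp is involved.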

I know that you mentioned that these httpd/tomcat connections are being made on the
respective hosts' "private addresses", and I can see in the logs that the problems happen
even on the host's local loopback address. But on the other hand, setting up NLB seems to
involve a common IP stack driver buried fairly deep in the protocol stack of each host
(and "affinity" parameters), and who knows what that thing is doing, or not doing.

Just to give an idea - and I realise that this article may have no direct relevance 
whatsoever to the present issue - see :
In this case, they are talking about the installation of some software package resulting 
indirectly in shortening the packet MTU, and this indirectly causing problems with some 
webserver functions.  Just to say that you may be faced with some deep issue like this, 
because of the NLB implementation.
