tomcat-users mailing list archives

From: Kees Jan Koster <kjkos...@gmail.com>
Subject: [solved] Re: connection reset errors
Date: Sun, 10 Jun 2012 19:09:23 GMT
Dear All,

Well, I managed to track this down. As it turned out, the problem was that I had a rather
short TCP listen queue on the Tomcat connector port (100 elements) and that queue was overflowing.
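
(For the record, the overflow is something you can see on FreeBSD with a command along these lines; netstat -L lists the current, incomplete and maximum listen queue lengths per listening socket, the port being whatever your connector uses:)

    netstat -Lan | grep 8080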

The solution was to 1) set acceptCount to a higher value in Tomcat and 2) configure my
OS to allow applications to request longer accept queues. That second step was the missing piece:
I had raised acceptCount before, but since the OS was capping the accept queue length I
did not see any improvement.
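
Roughly, the two changes look like this (values are illustrative, not a recommendation; on FreeBSD the OS-level cap is the kern.ipc.somaxconn sysctl, and it needs to be raised before Tomcat opens its server socket, because acceptCount is simply the backlog that Tomcat passes to listen() and the kernel silently clips it to the sysctl value):

    # /etc/sysctl.conf on FreeBSD, or at runtime: sysctl kern.ipc.somaxconn=1024
    kern.ipc.somaxconn=1024

    <!-- conf/server.xml -->
    <Connector port="8080" protocol="HTTP/1.1"
               acceptCount="1024" />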

More details can be found here:

http://java-monitor.com/forum/showthread.php?t=2492

A big thank you to all who contributed to this thread and helped me understand the problem.

Kees Jan


On 22 May 2012, at 14:45, André Warnier wrote:

> Kees Jan Koster wrote:
>> Dear André,
>>> Assuming that your client is really connecting to that HTTP connector on port 8080 mentioned above..
>> Yes, it has a forwarded port 80 (using FreeBSD ipfw) that also points to 8080, and there is an Apache with mod_proxy_http that hooks into 8081. My tests are on the vanilla port, though.
> 
> Can you be a bit clearer on this part ?  Do you see the problem happening for 1 in 10 posts, when your client connects directly to Tomcat's HTTP port 8080 ?
> Or is it only when the client connects to Tomcat via either one of these intermediate pieces of machinery ?
> 
>>> 1) You are getting a
>>> java.net.SocketException: Connection reset
>>> 	at java.net.SocketInputStream.read(SocketInputStream.java:168)
>>> 
>>> so this appears to happen while your Java client is reading the response from the server, and it appears that the client is expecting to be able to read more data, but finds itself unable to, because the socket has been closed "under its nose".
>> The reading is one area I need to look into: did the client get all data, partial data or none at all? I need to experiment with that.
>>> You say that it happens "frequently", so it's not always.
>> Indeed, not always. About 1 in 10 posts die like this on bad days. Sometimes hours with no issues. No pattern I can discern.
>>> 2) the server itself seems unaware that there is a problem.  So it has already written the whole response back to the client, decided it was done with this request, and gone happily to handle other things.
>> Precisely.
>>> That can happen, even if the client has not yet received all data, because between the server and the client there is a lot of piping, and the data may be buffered at various levels or still "in transit".
>>> 
>>> thus..
>>> 
>>> - either the client is misinterpreting the amount of data that it should be reading from the server's response (trying to read more than there actually is)
>>> (on the other hand, I think that the kind of exception you would get in that case would be different, more like "trying to read beyond EOF" or so).
>>> - or something in-between the server and the client closes the connection before all data has been returned to the client (and/or is losing data).
>>> 
>>> It would be helpful to know if this happens when the response is particularly large, or small, or if it is unrelated to the response size.
>> The response is a few bytes. I think it is about 10-20 bytes. Less than a packet, I expect. :)
> 
> That is quite strange, I think.
> See below.
> 
> 
>>> If the server is configured with an AccessLogValve, you should be able to see how big the response was, in bytes.  If you have control over the client code, you should be able to add something that logs how many bytes it has read before the exception occurs.
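
(For reference, a minimal AccessLogValve entry in conf/server.xml looks roughly like the following; the built-in "common" pattern ends with %b, the response size in bytes:)

    <Valve className="org.apache.catalina.valves.AccessLogValve"
           directory="logs" prefix="localhost_access_log" suffix=".txt"
           pattern="common" />
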
>> What makes the response size interesting? What previous experience are you basing this question on?
> 
> Just that intuitively, if a problem happens while reading the response, one would expect that the larger the response is, the more likely it is that some network issue would show up in the middle.
> 
> But now that I say this, going back to your initial message and the stacktrace in it, I see
> ..
> at sun.net.www.http.HttpClient.parseHTTPHeader
> ...
> so the problem seems to show up right away, while the response's HTTP *headers* are being read.  So it looks like when the problem happens, the client is not able to read anything at all, not even the headers..
> 
> Do all problems show the same stacktrace, all with a problem while reading/parsing the response headers ?
> 
> 
>>> Dumping the response HTTP headers to the client logfile would also help in finding out what happens. (If the client is an applet running inside of a browser, then a browser add-on would show this easily (like "Live HTTP Headers" for Firefox, or Fiddler2 for IE)).
>> I can check whether I see the same problems from a browser using Firebug, that is a good idea. Thanks.
>>> Doing a "traceroute" from the client to the server may also give an idea of what there is actually between the server and the client.
>> mtr reports no packet loss between the two machines I used for testing.
> 
> Actually, I was more thinking about some intermediate problematic proxy or something.
> But a traceroute or similar would not show that.
> 
> See the first question above, about the direct/indirect connection client-server.
> 
>>> And if this all still does not provide any clues, then you're down to a network packet trace, using Wireshark or similar.
>> Packet traces I was hoping to avoid. :(
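
(If it does come to that, a capture limited to the connector port keeps things small; the interface name here is an assumption, e.g. em0 on FreeBSD:)

    tcpdump -i em0 -s 0 -w tomcat-8080.pcap port 8080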
> 
> So far it smells to me like there is some network issue, with some intermediate software or hardware part which is dropping the connection between client and server, after the client has sent the request, but before it even starts receiving the response.
> Is there anything in-between client and server which could have this behaviour, such as when it gets very busy ?
> Do you have any kind of tool which can show you how many requests Tomcat is processing over time, and if these problems happen when it is handling lots of requests ?
> (Not that the problem appears to be at the Tomcat level, but just to check how busy the network may be at such times)
> 
> Another thing : your client is effectively requesting non-keepalive connections, so Tomcat will close the connection after sending the response to each request.  And your clients have to rebuild a new connection for each request.
> If the same client(s) make lots of small requests one after another, this may be counter-productive, because each connection build-up requires several packets going back and forth. Also, on the server side, when a connection is being closed, it will nevertheless "linger" for a while in CLOSE_WAIT state, waiting for the client's TCP stack to acknowledge the CLOSE.  I have seen cases where a large number of such connections being in CLOSE_WAIT triggered bizarre issues, such as a server becoming unable to accept new TCP connections for a while.
> It may be worth checking how many of such CLOSE_WAIT connections you have over time, and if this relates to when the problems happen.
> netstat -pan | grep CLOSE_WAIT
> would show this. If more than a couple of hundred show up, I'd become suspicious of something like that.
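
(A crude way to track that over time is to log a count every minute, e.g. with a loop like this; grep -c just counts the matching lines:)

    while true; do date; netstat -an | grep -c CLOSE_WAIT; sleep 60; done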


--
Kees Jan

http://java-monitor.com/
kjkoster@kjkoster.org
+31651838192

The secret of success lies in the stability of the goal. -- Benjamin Disraeli


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

