axis-java-user mailing list archives

From "Wang, Pengyu [IT]" <pengyu.w...@citigroup.com>
Subject RE: Too many CLOSE_WAIT socket connections
Date Mon, 03 Nov 2003 14:54:07 GMT
By default, java.net's HttpURLConnection uses HTTP/1.1, which keeps the
connection alive. This will give you CLOSE_WAIT sockets: on the client side
you are not closing the socket (keep-alive), and the server side takes a
while to figure out that you are no longer using the connection before it
closes it. The client then hangs in CLOSE_WAIT, waiting for another state
transition before it gives up (I don't remember which one offhand; I'd have
to pick up my TCP/IP book). The best way to observe this is with TCPMon, to
see whether you are sending the Keep-Alive header.
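
If you don't have TCPMon handy, you can also check from the client itself by
dumping the response headers. A minimal sketch (the endpoint URL is an
assumption for illustration):

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ShowConnectionHeader {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8080/axis/servlet/AxisServlet"); // assumed endpoint
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            // On an HTTP/1.1 response, anything other than "close" here means
            // the server intends to keep the connection open.
            System.out.println("Connection: " + conn.getHeaderField("Connection"));
            conn.disconnect();
        }
    }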


This is especially true for the Apache web server; I had to deal with a
similar issue with an embedded C++ Apache server before. The way I got
around it was to tell the java.net package not to use the keep-alive header
and to call setSoTimeout with a lower threshold. Another parameter is
SO_LINGER, but I didn't see any obvious effect from it once the above two
were set.
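
A minimal sketch of that workaround with the stock JDK client (the endpoint
and timeout values are assumptions for illustration):

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class NoKeepAliveClient {
        public static void main(String[] args) throws Exception {
            // JDK system property: disables persistent connections in the
            // built-in HTTP protocol handler (set it before connecting).
            System.setProperty("http.keepAlive", "false");

            URL url = new URL("http://localhost:8080/axis/servlet/AxisServlet"); // assumed endpoint
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestProperty("Connection", "close"); // ask the server to close its half
            conn.setReadTimeout(5000); // JDK 5+; the Socket-level equivalent is setSoTimeout(5000)

            System.out.println("HTTP " + conn.getResponseCode());
            conn.disconnect(); // tear the socket down instead of pooling it
        }
    }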



-----Original Message-----
From: Matteo Tamburini [mailto:mtf@fastwebnet.it]
Sent: Saturday, November 01, 2003 9:53 AM
To: axis-user@ws.apache.org
Subject: RE: Too many CLOSE_WAIT socket connections


Mike, thank you for your answer.
I'm using Linux, and I don't care about Windows.
Actually, my problem is not related to the web server; it's related to the
client.
My client-side CLOSE_WAIT sockets persist for a very long time: I left my
Java application running for several hours (about one night), and the next
morning I found the same number of CLOSE_WAIT sockets as the evening before,
while my Java app had been throwing exceptions all night because there was
no way to get another socket from the OS.
This makes me think that most likely it's not a problem with a timeout
parameter, but something related to an unreleased socket, somewhere.
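
In Java terms, I suspect something like a response InputStream that never
gets read to the end and closed, which keeps the descriptor alive. A sketch
of the cleanup I plan to check for (names assumed for illustration):

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.HttpURLConnection;

    public class DrainAndClose {
        // If the response stream is never drained and closed, the JDK keeps
        // the socket open, and once the server hangs up the OS parks it in
        // CLOSE_WAIT until the application finally closes its half.
        static void release(HttpURLConnection conn) throws IOException {
            InputStream in = conn.getInputStream();
            try {
                byte[] buf = new byte[4096];
                while (in.read(buf) != -1) { /* read to EOF */ }
            } finally {
                in.close();        // releases the file descriptor
                conn.disconnect(); // force teardown rather than pooling
            }
        }
    }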

From the netstat manpage you can read:
     CLOSE_WAIT:  The socket connection has been closed by the remote peer,
     and the system is waiting for the local application to close its half of
     the connection.

As you see, this means that the OS does not automatically close the socket
until the process that requested it releases it. Perhaps the reason is that,
from the OS's point of view, the process may be pooling its sockets somehow,
so why should the OS release them?
I think the timeout you suggest is related to the time the OS waits before
freeing the sockets of a process that no longer exists (i.e., kill -9 makes
a process exit without releasing its sockets, so after a timeout the OS
frees them), or the time the OS waits before releasing sockets in the
TIME_WAIT state. From the netstat manpage:
     TIME_WAIT:   The socket connection has been closed by the local
     application, the remote peer has closed its half of the connection, and
     the system is waiting to be sure that the remote peer received the last
     acknowledgement.

I've read that this parameter was once called tcp_close_wait_interval, but
the name was incorrect and confused many people, so it was renamed
tcp_time_wait_interval. So I don't think it's related to the CLOSE_WAIT
socket state.

Is that correct?
Anyway, on Monday I'll try reducing this parameter significantly, and I'll
let you know.

Any more ideas?

In the meantime, thank you Mike.

Bye,
Matteo.


> -----Original Message-----
> From: Mike Burati [mailto:mburati@bowstreet.com]
> Sent: Friday, October 31, 2003 8:01 PM
> To: 'axis-user@ws.apache.org'
> Subject: RE: Too many CLOSE_WAIT socket connections
> 
> 
> Both Unix and Windows appear to have their TCP time-wait timeouts
> set too high by default (about 5 minutes), so the OS leaves closed
> sockets queued up in the CLOSE_WAIT state (which can make it real
> easy to hit the maxfd limit on a heavily loaded web server).
> I believe the value is something like tcp_time_wait_interval
> (kernel param) on Unix systems, and I can't remember what the
> Windows Registry key is for the equivalent setting, but its
> name is similar.
> Set those smaller (e.g., 30 or 60 seconds) and you should avoid
> the problem you're seeing.
