From: "Wang, Pengyu [IT]"
To: "'axis-user@ws.apache.org'"
Subject: RE: Too many CLOSE_WAIT socket connections
Date: Mon, 3 Nov 2003 09:54:07 -0500

By default, HttpURLConnection in the java.net package uses HTTP/1.1, which keeps connections alive. This is what causes your CLOSE_WAIT: on the client side you are not closing the socket (keep-alive), and on the server side it takes a while for the server to decide that you are no longer using the connection before it closes it. The client then hangs in CLOSE_WAIT, waiting for another state transition before it gives up (I don't remember which one offhand; I'd have to pick up my TCP/IP book again).

The best way to observe this is to use TCPMon and see whether your requests carry a Keep-Alive header. This is especially true with the Apache web server, since I had to deal with a similar issue on an embedded C++ Apache server before.
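Pengyu's advice (turn off keep-alive, lower the socket timeout) can be sketched in client code. This is a minimal sketch, not the poster's actual code: the URL is illustrative, `http.keepAlive` is a standard java.net system property, and `setConnectTimeout`/`setReadTimeout` only exist on JDKs newer than the 2003-era ones in this thread (they arrived in J2SE 5.0).

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class NoKeepAlive {
    public static void main(String[] args) throws Exception {
        // Disable connection reuse JVM-wide.  The protocol handler typically
        // reads this property when it is first used, so set it early.
        System.setProperty("http.keepAlive", "false");

        // openConnection() does no network I/O, so this runs offline.
        HttpURLConnection conn =
            (HttpURLConnection) new URL("http://example.org/").openConnection();

        // Per-request belt and braces: ask the server to close the connection.
        conn.setRequestProperty("Connection", "close");

        // Bound how long a dead or slow peer can keep the socket tied up.
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);

        System.out.println("keepAlive=" + System.getProperty("http.keepAlive"));
        System.out.println("readTimeout=" + conn.getReadTimeout());
    }
}
```

With keep-alive off, each response ends with the connection being torn down immediately instead of lingering for the server's idle timeout, which is the scenario Pengyu describes.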
The way I got around it was to set the java.net client not to use the keep-alive header, and to call setSoTimeout with a lower threshold. Another parameter is SO_LINGER, but it doesn't seem to have any obvious effect once the above two are set.

-----Original Message-----
From: Matteo Tamburini [mailto:mtf@fastwebnet.it]
Sent: Saturday, November 01, 2003 9:53 AM
To: axis-user@ws.apache.org
Subject: R: Too many CLOSE_WAIT socket connections

Mike, thank you for your answer.

I'm using Linux, and I don't care about Windows. Actually, my problem is not related to the web server; it's related to the client.

My client-side CLOSE_WAIT sockets persist for a very long time: I left my Java application running for several hours (about one night), and the next morning I found the same number of CLOSE_WAIT sockets as the evening before. My Java app had also thrown exceptions all night about there being no way to get another socket from the OS.

This makes me think it is most likely not a problem with a timeout parameter, but something related to an unreleased socket, somewhere. From the netstat manpage:

    CLOSE_WAIT: The socket connection has been closed by the remote peer, and the system is waiting for the local application to close its half of the connection.

As you can see, this means the OS does not automatically close the socket until the process that requested it releases it. Perhaps the reason is that, from the OS's point of view, the process may be pooling its sockets somehow, so why release them?

I think the timeout you suggest is related to the time the OS waits before freeing sockets when the owning process no longer exists (i.e. kill -9 leaves a process's sockets unreleased, so after a timeout the OS frees them), or to the time the OS waits before releasing sockets in the TIME_WAIT state.
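To watch whether the CLOSE_WAIT count is actually growing, as Matteo did, you don't have to keep running netstat by hand: on Linux (which Matteo says he is using) the same information is in /proc/net/tcp, where the `st` column is 08 for CLOSE_WAIT and 06 for TIME_WAIT. A minimal sketch, not from the thread, that counts sockets in a given state (IPv4 only; IPv6 sockets live in /proc/net/tcp6):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class TcpStateCount {
    // /proc/net/tcp "st" codes (see Linux include/net/tcp_states.h).
    static final String CLOSE_WAIT = "08";
    static final String TIME_WAIT  = "06";

    static int count(String stateCode) throws IOException {
        int n = 0;
        try (BufferedReader r = new BufferedReader(new FileReader("/proc/net/tcp"))) {
            r.readLine();                              // skip the column-header line
            String line;
            while ((line = r.readLine()) != null) {
                // Columns: sl, local_address, rem_address, st, ...
                String[] f = line.trim().split("\\s+");
                if (f.length > 3 && f[3].equals(stateCode)) n++;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("CLOSE_WAIT: " + count(CLOSE_WAIT));
        System.out.println("TIME_WAIT:  " + count(TIME_WAIT));
    }
}
```

Run this periodically while the client is under load: a CLOSE_WAIT count that only ever grows points at sockets the application never closes, exactly the symptom Matteo describes.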
From man netstat:

    TIME_WAIT: The socket connection has been closed by the local application, the remote peer has closed its half of the connection, and the system is waiting to be sure that the remote peer received the last acknowledgement.

I've read that this parameter was once called tcp_close_wait_interval, but the name was incorrect and confused many people, so it was renamed tcp_time_wait_interval. So I don't think it's related to the CLOSE_WAIT socket state. Is that correct?

Anyway, on Monday I'll try reducing this parameter significantly, and then I'll let you know. Any more ideas? In the meantime, thank you Mike.

Bye,
Matteo.

> -----Original Message-----
> From: Mike Burati [mailto:mburati@bowstreet.com]
> Sent: Friday, 31 October 2003 20.01
> To: 'axis-user@ws.apache.org'
> Subject: RE: Too many CLOSE_WAIT socket connections
>
> Both Unix and Windows appear to have TCP time-wait timeouts set too
> high by default (about 5 minutes), where the OS leaves closed sockets
> queued up in the CLOSE_WAIT state (which can make it very easy to hit
> the max-fd limit on a heavily loaded web server).
> I believe the value is something like tcp_time_wait_interval (a kernel
> parameter) on Unix systems; I can't remember the Windows Registry key
> for the equivalent setting, but its name is similar.
> Set those smaller (e.g. 30 or 60 seconds) and you should avoid the
> problem you're seeing.
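If the root cause is an unreleased socket, as Matteo suspects, the usual client-side fix is not a kernel timeout at all but always draining and closing the response stream (and disconnecting) in a finally block: a stream that is never closed leaves its descriptor stuck in CLOSE_WAIT after the server hangs up. A minimal sketch, not the thread's actual code, exercised against a throwaway local server (`com.sun.net.httpserver` arrived in Java 6, after this thread):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class DrainAndClose {

    // Read the whole response, then always release the socket.  Skipping the
    // close() is the classic way to strand a descriptor in CLOSE_WAIT.
    static byte[] fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        InputStream in = null;
        try {
            in = conn.getInputStream();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {  // drain to EOF
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        } finally {
            if (in != null) in.close();  // releases (or returns) the socket
            conn.disconnect();           // with keep-alive off, closes it outright
        }
    }

    public static void main(String[] args) throws Exception {
        // Throwaway local server so the sketch is self-contained.
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/", exchange -> {
            byte[] body = "ok".getBytes("US-ASCII");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            String got = new String(
                fetch("http://127.0.0.1:" + server.getAddress().getPort() + "/"),
                "US-ASCII");
            System.out.println(got);  // prints "ok"
        } finally {
            server.stop(0);
        }
    }
}
```

Shrinking tcp_time_wait_interval, as Mike suggests, only shortens the TIME_WAIT phase; it does nothing for CLOSE_WAIT, which by definition waits on the application itself.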