From: Charles Anthony
To: activemq-users@geronimo.apache.org
Subject: TCP Connection Timeout Problems. Possibly.
Date: Fri, 11 Aug 2006 13:04:13 +0100

Hi All,

We've just had a nasty situation: our ActiveMQ server (standalone, plain vanilla TCP transport, no persistence, no nuffink) on one of our live installations suddenly refused to accept any new connections; no clients could connect. All currently connected clients were fine, and messages were being sent, received and processed fine.
Just no-one else could connect. After 20 minutes, new connections were suddenly allowed again.

The following exception was in our log:

    2006-Aug-11 12:17:47.726 aqualive [ActiveMQ Transport Server: tcp://blah:61616] ERROR org.apache.activemq.broker.TransportConnector - Could not accept connection: java.net.SocketException: Connection reset by peer: socket write error
    java.net.SocketException: Connection reset by peer: socket write error
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at org.apache.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:108)
        at java.io.DataOutputStream.flush(DataOutputStream.java:101)
        at org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:125)
        at org.apache.activemq.transport.InactivityMonitor.oneway(InactivityMonitor.java:141)
        at org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:128)
        at org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:64)
        at org.apache.activemq.transport.TransportFilter.start(TransportFilter.java:52)
        at org.apache.activemq.transport.TransportFilter.start(TransportFilter.java:52)
        at org.apache.activemq.broker.TransportConnection.start(TransportConnection.java:75)
        at org.apache.activemq.broker.TransportConnector$1.onAccept(TransportConnector.java:136)
        at org.apache.activemq.transport.tcp.TcpTransportServer.run(TcpTransportServer.java:137)
        at java.lang.Thread.run(Thread.java:534)

My interpretation of the above is that something (a port scanner, maybe? Our curious IT department?) is connecting to the listening socket, and the TransportServer is trying to tell the connecting process what the wire format is. The connecting process is just sitting there, not responding, acknowledging, or doing anything at all, yet not closing the connection.
Therefore, the transport server is blocked, preventing anyone else from connecting. After 20 minutes (which I am guessing is some kind of low-level timeout, seeing as all the default AMQ timeouts seem to be of the order of 1 - 30 secs) a low-level TCP exception is thrown, freeing the whole shebang up.

I notice there is an InactivityMonitor, and looking at the code there is the following comment:

    // Disable inactivity monitoring while processing a command.

Could this be the case? That, until the wire format has been negotiated, there is no timeout configured? Is there anything we can do to reduce this timeout from 20 minutes? Or have I completely gone down the wrong track?

This is AMQ 4.0, Win2K, JRE 1.4.2.

Cheers,

Charles
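
For illustration only, here is one general way to put an upper bound on a blocking handshake step, so that a silent peer cannot hold things up for 20 minutes. This is a sketch under assumptions, not how ActiveMQ itself behaves or is configured: the class BoundedHandshake and the method handshakeWithin are hypothetical names, and the code needs java.util.concurrent (Java 5 or later, with Java 7 syntax), so it would not run on the JRE 1.4.2 mentioned above. The idea is to run the handshake on a worker thread, wait for a limited time, and close the connection if the deadline passes.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not part of ActiveMQ): bounds a blocking handshake step. */
final class BoundedHandshake {

    private BoundedHandshake() { }

    /**
     * Runs the handshake on a worker thread and waits at most timeoutMillis.
     * Returns true if the handshake completed in time. On timeout, failure or
     * interruption the connection is closed and false is returned. Closing a
     * java.net.Socket makes a thread blocked in I/O on that socket throw a
     * SocketException, which is what releases the stuck worker.
     */
    static boolean handshakeWithin(ExecutorService workers,
                                   Callable<Void> handshake,
                                   Closeable connection,
                                   long timeoutMillis) {
        Future<Void> result = workers.submit(handshake);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            // Peer stalled or the handshake itself failed: give up on this client.
            result.cancel(true);
            closeQuietly(connection);
            return false;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for the caller and abandon the client.
            Thread.currentThread().interrupt();
            result.cancel(true);
            closeQuietly(connection);
            return false;
        }
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close fails.
        }
    }
}
```

With a real socket, the call might look like handshakeWithin(pool, task, socket, 30000), where task writes the wire format info to the socket. Note that the calling thread still waits up to the timeout for each client; handing the whole call to the pool as well would keep an accept loop free to take the next connection straight away.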