From: "Hubert, Eric"
Subject: Possible Causes for "Connection reset by peer" when using NIO
Date: Wed, 25 Jun 2008 18:13:45 +0200

Hi devs!

First of all, I'd like to apologize for posting a "user problem" to two dev lists. I only did this because I do not have much background knowledge of the NIO implementation, and I think a solid understanding of NIO is necessary to help tackle our problem.

We are using the WSO2 ESB, which is based on Apache Synapse, Apache Axis2 and the HTTP Core NIO module. As the stack trace only contains http-nio details, I cc'ed the HttpComponents dev list. Hopefully someone can help out.

When sending about 3000 Hessian requests per hour from clients (Tomcat) over the ESB (Synapse 1.2 running on JDK 1.5.15, Linux 2.6.23.1-amd64-75) to a BEA WebLogic 8.1 server, we see about 1 to 10 exceptions of type "java.io.IOException: Connection reset by peer" in the ESB log.

If I understand it correctly, the ESB then executes a failover to the next service node, as we are using a load balancing group. So the client is not affected, but the endpoint with the failure is marked as inactive.

The problem is that I don't understand the cause of this exception. It occurs during the read on a SocketChannel, so I think the server might be closing the connection while the ESB is reading. But maybe some kind of pool is used internally and a connection can change to some abnormal state?

We have seen such exceptions before, when we were using HTTP 1.1 in combination with the BEA WebLogic server. This was very likely an issue with HTTP keep-alive (persistent connections). So for any connection to a BEA service we use the property mediator of Synapse to change the connection ESB <-> BEA to use HTTP 1.0:

Since then we had not seen this exception again. But now, after switching to another environment, we see it again, though only for Hessian services.
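
To make the suspected scenario concrete, here is a minimal, self-contained sketch of a peer aborting a connection and the other side then reading from its SocketChannel. It is not taken from Synapse, Axis2 or HttpCore: the class name ResetDemo and the method readAfterAbortiveClose are made up for illustration, it uses the Java 7+ NIO API rather than the JDK 1.5 we run, and forcing the reset through SO_LINGER=0 is only an assumption about one way a peer can abort a connection, not a claim about what WebLogic actually does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; not part of Synapse, Axis2 or HttpCore.
class ResetDemo {

    /**
     * Opens a loopback connection, lets the accepting side abort it, and
     * then reads on the client side. Returns the text of the IOException
     * seen by the reader, or null if the read finished without one.
     */
    static String readAfterAbortiveClose() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                SocketChannel accepted = server.accept();
                // Assumption: SO_LINGER=0 makes close() abort the connection
                // (TCP RST) instead of performing the orderly FIN shutdown.
                accepted.setOption(StandardSocketOptions.SO_LINGER, 0);
                accepted.close();
                try {
                    // Blocking read on the side that still believes the
                    // connection is open.
                    client.read(ByteBuffer.allocate(64));
                    return null;
                } catch (IOException e) {
                    return e.toString();
                }
            }
        } catch (IOException e) {
            // Setup or cleanup failed; surface it unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read outcome: " + readAfterAbortiveClose());
    }
}
```

After an orderly close by the peer, the same read would return -1 instead of throwing, which is one way to tell a normal keep-alive timeout on the server from an aborted connection.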

I have no clue what else could cause this exception. How can we detect the cause? How can we narrow down the possible causes, if there are several? I don't expect network outages to be the reason, as other (SOAP-based) services are working pretty well.

The exact exception we are getting is:

java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:207)
    at org.apache.http.impl.nio.reactor.SessionInputBufferImpl.fill(SessionInputBufferImpl.java:85)
    at org.apache.http.impl.nio.codecs.AbstractMessageParser.fillBuffer(AbstractMessageParser.java:97)
    at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:113)
    at org.apache.http.impl.nio.DefaultClientIOEventDispatch.inputReady(DefaultClientIOEventDispatch.java:99)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:98)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:195)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:180)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:142)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:70)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:318)

This exception occurs consistently a few times per hour on every possible combination of client node, ESB node and service endpoint node.

Any pointer or idea is greatly appreciated. Thanks a lot in advance!

Regards,
Eric