From: "Asankha C. Perera"
Date: Wed, 07 Nov 2012 09:31:27 +0530
To: Tomcat Users List, Christopher Schultz
Subject: Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Message-ID: <5099DD17.5000506@apache.org>
In-Reply-To: <50996CAA.1050407@christopherschultz.net>

Hi Chris

> My expectation from the backlog is:
>
> 1. Connections that can be handled directly will be accepted and work
>    will begin
>
> 2. Connections that cannot be handled will accumulate in the backlog
>
> 3. Connections that exceed the backlog will get "connection refused"
>
> There are caveats, I would imagine. For instance, do the connections in
> the backlog have any kind of server-side timeouts associated with them
> -- that is, will they ever get discarded from the queue without ever
> being handled by the bound process (assuming the bound process doesn't
> terminate or anything weird like that)? Do the clients have any timeouts
> associated with them?
>
> Does the above *not* happen? On which platform?
> Is this only with NIO?

I am not a Linux-level TCP expert, but my understanding is that the TCP
layer has its own timeouts, and older connection requests will eventually
be discarded from the queue. Typically a client has a TCP-level timeout as
well, i.e. the time it will wait for the other party to accept its SYN
packet. My testing has been primarily on Linux / Ubuntu.

Leaving everything to the TCP backlog makes the end clients see nasty RSTs
when Tomcat is under load, instead of "connection refused" - and that can
prevent a client from performing a clean fail-over when one Tomcat node is
overloaded.

> So you are eliminating the backlog entirely? Or are you allowing the
> backlog to work as "expected"? Does closing and re-opening the socket
> clear the existing backlog (which would cancel a number of waiting
> though not technically accepted connections, I think), or does it retain
> the backlog? Since you are re-binding, I would imagine that the backlog
> gets flushed every time there is a "pause".

I am not sure how the backlog would behave under different operating
systems and conditions. However, the code I've shared shows how a pure
Java program can take better control of the underlying TCP behavior, as
visible to its clients.

> What about performance effects of maintaining a connector-wide counter
> of "active" connections, plus pausing and resuming the channel -- plus
> re-connects by clients that have been dropped from the backlog?

What the UltraESB does by default is stop accepting new connections after
a threshold is reached (e.g. 4096) and remain paused until the number of
active connections drops back below another threshold (e.g. 3073). Each of
these parameters is user-configurable and depends on the maximum number of
connections each node is expected to handle.
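To make the idea concrete, here is a minimal sketch (not the actual shared
example or UltraESB code - the class name and structure are illustrative)
of how a pure Java NIO acceptor can "pause" and "resume" accepting by
toggling OP_ACCEPT interest on the selection key, rather than closing and
re-binding the listening socket:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Sketch only: pausing/resuming accepts via selector interest ops.
class PauseResumeAccept {

    static boolean demo() throws IOException {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        // Bind with a deliberately small backlog; what happens to SYNs
        // beyond the backlog (silent drop vs refusal) is platform-dependent.
        server.bind(new InetSocketAddress("127.0.0.1", 0), 2);

        Selector selector = Selector.open();
        SelectionKey acceptKey = server.register(selector, SelectionKey.OP_ACCEPT);

        // "Pause": stop watching for new connections. Already-queued
        // connections stay in the kernel backlog until accepted or timed out.
        acceptKey.interestOps(0);

        // "Resume": start accepting again once the load has dropped.
        acceptKey.interestOps(SelectionKey.OP_ACCEPT);

        boolean restored = acceptKey.interestOps() == SelectionKey.OP_ACCEPT;
        selector.close();
        server.close();
        return restored;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("accept interest restored: " + demo());
    }
}
```

Whether this behaves better than re-binding depends on how the platform
treats the pending backlog while OP_ACCEPT is deregistered, so it would
need testing on each target OS.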
Maintaining connector-wide counts does not, in my experience, cause any
measurable performance impact, and neither do re-connects by clients -
because what is expected in reality is for a hardware load balancer to
forward requests "refused" by one node to another node, which hopefully is
not loaded. Such a fail-over can take place immediately, cleanly and
without any confusion, even if the backend service is not idempotent. This
is clearly not the case when a TCP/HTTP connection is accepted and then
met with a hard RST after part or all of a request has been sent over it.

> I'm concerned that all of your bench tests appear to be done using
> telnet with a single acceptable connection. What if you allow 1000
> simultaneous connections and test it under some real load so we can see
> how such a solution would behave.

Clearly, the example I shared was just to illustrate the behavior with a
pure Java program. We usually conduct performance tests over half a dozen
open source ESBs with concurrency levels of 20, 40, 80, 160, 320, 640,
1280 and 2560, and payload sizes of 0.5, 1, 5, 10 and 100K bytes. You can
see some of the scenarios at http://esbperformance.org. We privately
conduct performance tests at concurrency levels much higher than 2560.

We used an HttpComponents-based EchoService as our backend service all
this time, and it behaved very well at all load levels. However, some
weeks back we accepted a contribution of an async servlet to be deployed
on Tomcat, as it was considered more "real world". The issues I noticed
arose when running high load levels against this servlet deployed on
Tomcat, especially when the response was being delayed to simulate
realistic behavior.

Although we do not use Tomcat ourselves, our customers do. I am also not
calling this a bug - but an area for possible improvement. If the Tomcat
users, developers and the PMC think this is worthwhile to pursue, I
believe it would be a good enhancement - maybe even a good GSoC project.
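The connector-wide counting with high/low watermarks described above can
be sketched with a single atomic counter - which is why the overhead is
negligible. This is a hypothetical illustration, not UltraESB's actual
API; the thresholds 4096/3073 are the example defaults mentioned earlier:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the high/low watermark idea: pause accepting at maxActive,
// resume only after the active count falls back to resumeAt.
class AcceptGate {
    private final int maxActive;   // pause accepting at this count (e.g. 4096)
    private final int resumeAt;    // resume once active drops to this (e.g. 3073)
    private final AtomicInteger active = new AtomicInteger();
    private volatile boolean paused = false;

    AcceptGate(int maxActive, int resumeAt) {
        this.maxActive = maxActive;
        this.resumeAt = resumeAt;
    }

    // Called when a connection is accepted; a real selector loop would
    // deregister OP_ACCEPT when this trips the pause.
    void onAccept() {
        if (active.incrementAndGet() >= maxActive) {
            paused = true;
        }
    }

    // Called when a connection closes; the selector loop would re-register
    // OP_ACCEPT when the count falls back to the low watermark.
    void onClose() {
        if (active.decrementAndGet() <= resumeAt) {
            paused = false;
        }
    }

    boolean acceptingPaused() { return paused; }
    int activeCount() { return active.get(); }

    public static void main(String[] args) {
        AcceptGate gate = new AcceptGate(4096, 3073);
        for (int i = 0; i < 4096; i++) gate.onAccept();
        System.out.println("paused at peak: " + gate.acceptingPaused());
        for (int i = 0; i < 1023; i++) gate.onClose();
        System.out.println("active " + gate.activeCount()
                + ", accepting resumed: " + !gate.acceptingPaused());
    }
}
```

The gap between the two thresholds provides hysteresis, so the connector
does not flap between accepting and refusing around a single limit.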
As a fellow member of the ASF and a committer on multiple projects over
the years, I believed it was my duty to bring this to the attention of the
Tomcat community.

regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org