From: llowder@oreillyauto.com
To: "Tomcat Users List"
Subject: Re: Tuning session replication on clusters
Date: Fri, 7 Sep 2012 13:00:58 -0500

Shanti Suresh wrote on 09/07/2012 12:37:34 PM:

> From: Shanti Suresh
> To: Tomcat Users List
> Date: 09/07/2012 12:44 PM
> Subject: Re: Tuning session replication on clusters
>
> Hi Kyle,
>
> Great testing, btw.
>
> So when you say "x5", did you change the settings as follows:
> rxBufSize="125940" (=> 25188 x 5)

I work with Kyle on this, and was the OP for this thread. When he says x5,
what he did was set the rx/tx values to 5 times their original values, so
yes, if the original value was 25188 then the x5 value would be 125940.

We figured that using multiples of the original values would make for the
simplest analysis and for more accurate comparisons.

>
> By any chance, have you analyzed a heapdump of Tomcat at periodic intervals
> to see which class is hogging heap during the session replication?
>
> Thanks.
>
> -Shanti
>
> On Fri, Sep 7, 2012 at 12:19 PM, wrote:
>
> > Chris:
> > >Assembling the sessions into a Collection is likely to be very fast,
> > >since it's just copying references around: the size of the individual
> > >sessions should not matter. Of course, pushing all those bytes to the
> > >other servers...
> >
> > >Perhaps Tomcat does something like serialize the session to a big
> > >binary structure and then sends that (which sounds insane -- streaming
> > >binary data is how that should be done -- but I haven't checked the
> > >code to be sure).
> >
> > It appears that tomcat is serializing all the data into a singular
> > structure, rather than a collection of references. Watching VisualVM plot
> > heap usage during replication, it nearly doubles (in my test env, this was
> > the only thing running so that makes sense). If you're sure Tomcat is only
> > making references, then I'd propose there is a problem with the JVM
> > dereferencing the collection elements and double-counting the memory used.
> > Either way, it's enough to make the JVM report a doubling of heap usage and
> > a rise in the heap allocation. As soon as replication is done, heap use
> > goes back to normal. I've attached a screenshot to the zip file.
> >
> >
> > Now for data:
> > I did tests of 200 sessions (~20 MB) at a time (200, 400, 600... up to
> > 3000). I then tested in groups of 1000 (3000, 4000, 5000... up to 10k).
> > At no point did I receive any exceptions or OOME issues. Heap usage never
> > climbed above 60% Xmx. My lab was isolated to help give consistent
> > results. Here are some points.
> >
> > 1. There is a pivotal point where replication performance degrades
> > dramatically. In my tests, this happened around 2400-2600 sessions. I
> > restarted tomcat and was able to avoid the issue, until I hit 2800 sessions
> > (~300 MB total session data). There was a 153% jump in time required to
> > perform replication at this point. From there, each subsequent test took
> > marginally longer per session (15-25%) than the test before it. Chris was
> > correct, it's not exponential, but the ms/session gets worse and worse as
> > we climb. I have no explanation for the sharp jump or the continued
> > degradation as we climb. I've seen similar performance issues with sort
> > and comparative logic, but those don't make sense here. Perhaps this
> > serialized object is being jerked around Young Gen/Old Gen and having to be
> > constantly reallocated? Grasping at straws here...
> >
> > 2. Networking is a large portion of the bottleneck for large data sets.
> > The thread size and pool size attributes of the sender/receiver had no
> > impact on throughput. Also, a packet capture revealed nothing naughty
> > happening. However, the rxBufSize and txBufSize values on the Nio receiver
> > and the PooledParallel transport elements made a profound difference. I
> > generated 7000 sessions (~700MB) and used default settings: 74 sec.
> > Increasing the rx/tx settings by x5 I was able to replicate the sessions in
> > 33 sec. Gains beyond x5 were almost nil; x100 (which is absurd) only
> > resulted in 29.3 sec replication.
> > A simple SCP transfer of a 700 MB file (using tmpfs folders) between these
> > same two systems took 13 seconds.
> >
> > My conclusion is that tuning the network was obviously a great help, but it
> > still took 30 seconds to replicate 700MB worth of session data on a network
> > with enough throughput to perform the transfer in 13 seconds. I don't know
> > if further network settings could be changed for the DeltaManager to aid in
> > speeding up replication, but given the spike in memory use and the pivotal
> > performance drop at a consistent point I'm inclined to think we're hitting
> > some edge case regarding session size and memory settings (Xmx/Heap and
> > NewSize/SurvivorRatio). As Chris said, if Tomcat isn't collecting just
> > references, it probably should be.
> >
> > Feel free to pick apart my data or thoughts. I tried to be as analytical
> > as possible, but there's a lot of conjecture in here.
> >
> > Attachment
> > (See attached file: SessionResearch.zip)
> > If the list strips it, find copy here:
> > https://docs.google.com/open?id=0B876X8DOwh8peEkyZVd6RVc4cWc
> >
> > Thanks.
> >
> > Kyle Harper
> >
> >
> > This communication and any attachments are confidential, protected by
> > Communications Privacy Act 18 USCS § 2510, solely for the use of the
> > intended recipient, and may contain legally privileged material. If you are
> > not the intended recipient, please return or destroy it immediately. Thank
> > you.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> > For additional commands, e-mail: users-help@tomcat.apache.org
> >
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
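
As a rough cross-check of the timings quoted in this thread, the figures can be
turned into effective throughput. This is only a sketch under stated
assumptions: it treats the session data as exactly 700 MB (the message says
~700 MB), and the dictionary labels and the helper name throughput_mb_per_sec
are made up for illustration. An SCP copy of a file is also not the same
workload as serializing and replicating sessions, so the comparison is
indicative at best.

```python
# Timings quoted in the thread for ~700 MB of session data (7000 sessions).
# Assumption: the payload is treated as exactly 700 MB for the arithmetic.
DATA_MB = 700

timings_sec = {
    "default rx/tx buffers": 74.0,
    "rx/tx buffers x5": 33.0,
    "rx/tx buffers x100": 29.3,
    "scp of a 700 MB file": 13.0,
}


def throughput_mb_per_sec(data_mb, seconds):
    """Effective throughput in MB/s for a transfer of data_mb in seconds."""
    return data_mb / seconds


# Effective throughput for each scenario.
rates = {
    label: throughput_mb_per_sec(DATA_MB, secs)
    for label, secs in timings_sec.items()
}

for label, rate in rates.items():
    print(f"{label:24s} {timings_sec[label]:6.1f} s  {rate:5.1f} MB/s")

# The "x5" buffer value discussed in the thread: 5 times the original 25188.
x5_buffer_bytes = 25188 * 5
print("x5 buffer size:", x5_buffer_bytes)
```

On these assumptions the default settings work out to roughly 9.5 MB/s, the x5
settings to about 21 MB/s, and the SCP copy to about 54 MB/s, which is the gap
between replication and raw transfer that Kyle describes.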