Return-Path: Delivered-To: apmail-geronimo-user-archive@www.apache.org Received: (qmail 85812 invoked from network); 12 Aug 2008 18:22:03 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 12 Aug 2008 18:22:03 -0000 Received: (qmail 52083 invoked by uid 500); 12 Aug 2008 18:22:00 -0000 Delivered-To: apmail-geronimo-user-archive@geronimo.apache.org Received: (qmail 52073 invoked by uid 500); 12 Aug 2008 18:22:00 -0000 Mailing-List: contact user-help@geronimo.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: List-Post: Reply-To: user@geronimo.apache.org List-Id: Delivered-To: mailing list user@geronimo.apache.org Received: (qmail 52058 invoked by uid 99); 12 Aug 2008 18:22:00 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 12 Aug 2008 11:22:00 -0700 X-ASF-Spam-Status: No, hits=2.6 required=10.0 tests=DNS_FROM_OPENWHOIS,SPF_HELO_PASS,SPF_PASS,WHOIS_MYPRIVREG X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of lists@nabble.com designates 216.139.236.158 as permitted sender) Received: from [216.139.236.158] (HELO kuber.nabble.com) (216.139.236.158) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 12 Aug 2008 18:21:03 +0000 Received: from isper.nabble.com ([192.168.236.156]) by kuber.nabble.com with esmtp (Exim 4.63) (envelope-from ) id 1KSyUt-0007lP-3w for user@geronimo.apache.org; Tue, 12 Aug 2008 11:21:31 -0700 Message-ID: <18948990.post@talk.nabble.com> Date: Tue, 12 Aug 2008 11:21:31 -0700 (PDT) From: pc3 To: user@geronimo.apache.org Subject: Re: large-scale app.. random spiking on even static files! In-Reply-To: <181D9AEB-C89F-4EE5-9E77-A1A0701FE5A6@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Nabble-From: peter.r.clark@gmail.com References: <18902003.post@talk.nabble.com> <18902031.post@talk.nabble.com> <181D9AEB-C89F-4EE5-9E77-A1A0701FE5A6@gmail.com> X-Virus-Checked: Checked by ClamAV on apache.org How does this look in terms of my GC settings? Note the Full GC taking 7-9 sec... and yes, our users are feeling it. 9294.839: [Full GC [PSYoungGen: 5669K->0K(518848K)] [PSOldGen: 2270713K->2036966K(3495296K)] 2276382K->2036966K(4014144K) [PSPermGen: 132633K->132553K(262144K)], 9.6645590 secs] 9309.878: [GC [PSYoungGen: 513152K->8669K(530752K)] 2550118K->2045635K(4026048K), 0.0446680 secs] 9320.038: [GC [PSYoungGen: 518557K->9475K(516224K)] 2555523K->2054068K(4011520K), 0.0609930 secs] 9334.888: [GC [PSYoungGen: 516163K->24115K(521664K)] 2560756K->2083565K(4016960K), 0.0940340 secs] 9343.975: [GC [PSYoungGen: 521651K->33271K(527744K)] 2581101K->2132460K(4023040K), 0.1501200 secs] 9355.803: [GC [PSYoungGen: 527735K->9919K(538176K)] 2626924K->2141656K(4033472K), 0.1184410 secs] 9364.520: [GC [PSYoungGen: 297949K->4925K(496384K)] 2429687K->2145793K(3991680K), 0.0595340 secs] 9364.580: [Full GC [PSYoungGen: 4925K->0K(496384K)] [PSOldGen: 2140868K->2145631K(3495296K)] 2145793K->2145631K(3991680K) [PSPermGen: 132572K->132572K(262144K)], 7.3085600 secs] 9377.550: [GC [PSYoungGen: 491456K->7979K(532288K)] 2637087K->2153610K(4027584K), 0.0367470 secs] 9388.467: [GC [PSYoungGen: 496427K->10198K(495744K)] 2642058K->2162894K(3991040K), 0.0603230 secs] 9398.530: [GC [PSYoungGen: 495702K->9133K(521984K)] 2648398K->2171525K(4017280K), 0.0653030 secs] 9410.113: [GC [PSYoungGen: 491693K->8348K(488064K)] 2654085K->2178941K(3983360K), 0.0645200 secs] 9419.784: [GC [PSYoungGen: 488028K->9976K(512256K)] 2658621K->2187915K(4007552K), 0.0631060 secs] 9429.442: [GC [PSYoungGen: 486727K->8985K(483008K)] 2664666K->2196513K(3978304K), 0.0663820 secs] 9431.905: [GC [PSYoungGen: 117391K->2056K(511424K)] 2304920K->2198083K(4006720K), 0.0544540 secs] 9431.960: [Full GC [PSYoungGen: 2056K->0K(511424K)] [PSOldGen: 2196026K->2197951K(3495296K)] 2198083K->2197951K(4006720K) [PSPermGen: 132589K->132589K(262144K)], 7.4168870 secs] 9446.836: [GC [PSYoungGen: 473984K->9149K(480320K)] 2671935K->2207101K(3975616K), 0.0589960 secs] 9455.282: [GC [PSYoungGen: 480317K->8795K(497856K)] 2678269K->2215117K(3993152K), 0.0715080 secs] 9461.453: [GC [PSYoungGen: 477211K->8833K(474560K)] 2683533K->2223101K(3969856K), 0.0462470 secs] 9470.165: [GC [PSYoungGen: 474497K->8743K(489216K)] 2688765K->2231179K(3984512K), 0.0698200 secs] 9480.552: [GC [PSYoungGen: 471719K->8648K(468992K)] 2694155K->2239227K(3964288K), 0.0698500 secs] 9493.930: [GC [PSYoungGen: 468936K->7773K(481536K)] 2699515K->2245452K(3976832K), 0.0966450 secs] 9499.391: [GC [PSYoungGen: 150822K->3396K(461120K)] 2388502K->2248495K(3956416K), 0.0476100 secs] 9499.438: [Full GC [PSYoungGen: 3396K->0K(461120K)] [PSOldGen: 2245099K->2225315K(3495296K)] 2248495K->2225315K(3956416K) [PSPermGen: 132614K->132614K(262144K)], 6.7599780 secs] Kevan Miller wrote: > > > On Aug 9, 2008, at 12:09 AM, Pete Clark wrote: > >> Finally, here are our startup opts as it pertains to GC if it helps: >> >> -XX:+HeapDumpOnOutOfMemoryError -XX:PermSize=256m -XX:MaxPermSize=256m >> -Xmx5120m -Xms5120m -Xss128k -XX:ParallelGCThreads=20 >> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:SurvivorRatio=8 >> -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=31 >> -XX:+AggressiveOpts -Xloggc:/var/tmp/gc.log > > Have you attempted to correlate long response times with Garbage > Collection activity? That would be my initial suspicion. I confess > that I'm not up-to-date on the latest GC parameters. So, without a bit > of reading on my part, I'm not entirely sure of the effect of your > tuning parameters. > > Does your application actually require a 5 gig heap? > > What's your CPU utilization look like during periods of slow response > time? > > IIUC correctly, the below results are for retrieval of a static text > file. However, there is additional activity on the server (retrieval > of Shockwave Flash files and a URL proxy). Is that correct? Is it > feasible to isolate these functions? > > --kevan > >> >> >> On Fri, Aug 8, 2008 at 11:59 PM, pc3 wrote: >>> >>> Oh yeah, and we're using tomcat as the servlet engine.. thanks! >>> >>> >>> pc3 wrote: >>>> >>>> hey all - >>>> >>>> so i've spent the past two weeks trying to figure out why my large >>>> scale >>>> application (200m page views per month) is having some serious, yet >>>> random, spikes in response times. >>>> >>>> we've got an app running under gero 1.1.1, using hibernate, mysql, >>>> etc. >>>> i've got the response times down in general by adding memcaching >>>> and other >>>> optimizations but we still see the spikes.. e.g. most of our actions >>>> return in 0.1 sec, but occasionally (and often enough!) .. we'll >>>> see a >>>> random spike up to 30s from different machines in our cluster. >>>> >>>> so for kicks tonight i tried running a test against all of our app >>>> servers >>>> not for an action but for a plain .txt file ... and guess what... >>>> spiky >>>> responses just from the txt file! a 4.9kb text file! look: >>>> >>>> $profileUrl /blah/Languages.txt >>>> Host 192.168.1.100 >>>> Time taken for tests: 8.555825 seconds <<<<<<<<<<<< >>>> Host 192.168.1.101 >>>> Time taken for tests: 0.2064 seconds >>>> Host 192.168.1.108 >>>> Time taken for tests: 0.1436 seconds >>>> Host 192.168.1.104 >>>> Time taken for tests: 0.1667 seconds >>>> Host 192.168.1.112 >>>> Time taken for tests: 0.1444 seconds >>>> Host 192.168.1.117 >>>> Time taken for tests: 0.4575 seconds >>>> Host 192.168.1.118 >>>> Time taken for tests: 0.2015 seconds >>>> Host 192.168.1.119 >>>> Time taken for tests: 0.2003 seconds >>>> Host 192.168.1.120 >>>> Time taken for tests: 0.1713 seconds >>>> Host 192.168.1.121 >>>> Time taken for tests: 7.22861 seconds <<<<<<<<<<<< >>>> Host 192.168.1.122 >>>> Time taken for tests: 0.1615 seconds >>>> >>>> Here are some things to go on: >>>> 1) we still host a few swfs from these servers that are hit >>>> frequently. >>>> could these be causing some kind of blocking? >>>> 2) we have a method in our application that acts as a proxy, >>>> taking a url >>>> as a parameter, fetching the url then dumping it to the response >>>> >>>> I'm open to any ideas or suggestions as to what could be causing the >>>> spikiness of response times here for a txt file.. one that never >>>> even hits >>>> our java code. >>>> >>>> Thanks very much all... >>>> >>>> >>> >>> -- >>> View this message in context: >>> http://www.nabble.com/large-scale-app..-random-spiking-on-even-static-files%21-tp18902003s134p18902031.html >>> Sent from the Apache Geronimo - Users mailing list archive at >>> Nabble.com. >>> >>> > > > -- View this message in context: http://www.nabble.com/large-scale-app..-random-spiking-on-even-static-files%21-tp18902003s134p18948990.html Sent from the Apache Geronimo - Users mailing list archive at Nabble.com.