geronimo-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From pc3 <peter.r.cl...@gmail.com>
Subject Re: large-scale app.. random spiking on even static files!
Date Tue, 12 Aug 2008 18:21:31 GMT

How does this look in terms of my GC settings? 

Note the Full GC taking 7-9 sec... and yes, our users are feeling it.  

9294.839: [Full GC [PSYoungGen: 5669K->0K(518848K)] [PSOldGen:
2270713K->2036966K(3495296K)] 2276382K->2036966K(4014144K) [PSPermGen:
132633K->132553K(262144K)], 9.6645590 secs]
9309.878: [GC [PSYoungGen: 513152K->8669K(530752K)]
2550118K->2045635K(4026048K), 0.0446680 secs]
9320.038: [GC [PSYoungGen: 518557K->9475K(516224K)]
2555523K->2054068K(4011520K), 0.0609930 secs]
9334.888: [GC [PSYoungGen: 516163K->24115K(521664K)]
2560756K->2083565K(4016960K), 0.0940340 secs]
9343.975: [GC [PSYoungGen: 521651K->33271K(527744K)]
2581101K->2132460K(4023040K), 0.1501200 secs]
9355.803: [GC [PSYoungGen: 527735K->9919K(538176K)]
2626924K->2141656K(4033472K), 0.1184410 secs]
9364.520: [GC [PSYoungGen: 297949K->4925K(496384K)]
2429687K->2145793K(3991680K), 0.0595340 secs]
9364.580: [Full GC [PSYoungGen: 4925K->0K(496384K)] [PSOldGen:
2140868K->2145631K(3495296K)] 2145793K->2145631K(3991680K) [PSPermGen:
132572K->132572K(262144K)], 7.3085600 secs]
9377.550: [GC [PSYoungGen: 491456K->7979K(532288K)]
2637087K->2153610K(4027584K), 0.0367470 secs]
9388.467: [GC [PSYoungGen: 496427K->10198K(495744K)]
2642058K->2162894K(3991040K), 0.0603230 secs]
9398.530: [GC [PSYoungGen: 495702K->9133K(521984K)]
2648398K->2171525K(4017280K), 0.0653030 secs]
9410.113: [GC [PSYoungGen: 491693K->8348K(488064K)]
2654085K->2178941K(3983360K), 0.0645200 secs]
9419.784: [GC [PSYoungGen: 488028K->9976K(512256K)]
2658621K->2187915K(4007552K), 0.0631060 secs]
9429.442: [GC [PSYoungGen: 486727K->8985K(483008K)]
2664666K->2196513K(3978304K), 0.0663820 secs]
9431.905: [GC [PSYoungGen: 117391K->2056K(511424K)]
2304920K->2198083K(4006720K), 0.0544540 secs]
9431.960: [Full GC [PSYoungGen: 2056K->0K(511424K)] [PSOldGen:
2196026K->2197951K(3495296K)] 2198083K->2197951K(4006720K) [PSPermGen:
132589K->132589K(262144K)], 7.4168870 secs]
9446.836: [GC [PSYoungGen: 473984K->9149K(480320K)]
2671935K->2207101K(3975616K), 0.0589960 secs]
9455.282: [GC [PSYoungGen: 480317K->8795K(497856K)]
2678269K->2215117K(3993152K), 0.0715080 secs]
9461.453: [GC [PSYoungGen: 477211K->8833K(474560K)]
2683533K->2223101K(3969856K), 0.0462470 secs]
9470.165: [GC [PSYoungGen: 474497K->8743K(489216K)]
2688765K->2231179K(3984512K), 0.0698200 secs]
9480.552: [GC [PSYoungGen: 471719K->8648K(468992K)]
2694155K->2239227K(3964288K), 0.0698500 secs]
9493.930: [GC [PSYoungGen: 468936K->7773K(481536K)]
2699515K->2245452K(3976832K), 0.0966450 secs]
9499.391: [GC [PSYoungGen: 150822K->3396K(461120K)]
2388502K->2248495K(3956416K), 0.0476100 secs]
9499.438: [Full GC [PSYoungGen: 3396K->0K(461120K)] [PSOldGen:
2245099K->2225315K(3495296K)] 2248495K->2225315K(3956416K) [PSPermGen:
132614K->132614K(262144K)], 6.7599780 secs]




Kevan Miller wrote:
> 
> 
> On Aug 9, 2008, at 12:09 AM, Pete Clark wrote:
> 
>> Finally, here are our startup opts as it pertains to GC if it helps:
>>
>> -XX:+HeapDumpOnOutOfMemoryError -XX:PermSize=256m -XX:MaxPermSize=256m
>> -Xmx5120m -Xms5120m -Xss128k -XX:ParallelGCThreads=20
>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:SurvivorRatio=8
>> -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=31
>> -XX:+AggressiveOpts -Xloggc:/var/tmp/gc.log
> 
> Have you attempted to correlate long response times with Garbage  
> Collection activity? That would be my initial suspicion. I confess  
> that I'm not up-to-date on the latest GC parameters. So, without a bit  
> of reading on my part, I'm not entirely sure of the effect of your  
> tuning parameters.
> 
> Does your application actually require a 5 gig heap?
> 
> What's your CPU utilization look like during periods of slow response  
> time?
> 
> IIUC correctly, the below results are for retrieval of a static text  
> file. However, there is additional activity on the server (retrieval  
> of Shockwave Flash files and a URL proxy). Is that correct? Is it  
> feasible to isolate these functions?
> 
> --kevan
> 
>>
>>
>> On Fri, Aug 8, 2008 at 11:59 PM, pc3 <peter.r.clark@gmail.com> wrote:
>>>
>>> Oh yeah, and we're using tomcat as the servlet engine.. thanks!
>>>
>>>
>>> pc3 wrote:
>>>>
>>>> hey all -
>>>>
>>>> so i've spent the past two weeks trying to figure out why my large  
>>>> scale
>>>> application (200m page views per month) is having some serious, yet
>>>> random, spikes in response times.
>>>>
>>>> we've got an app running under gero 1.1.1, using hibernate, mysql,  
>>>> etc.
>>>> i've got the response times down in general by adding memcaching  
>>>> and other
>>>> optimizations but we still see the spikes.. e.g. most of our actions
>>>> return in 0.1 sec, but occasionally (and often enough!) .. we'll  
>>>> see a
>>>> random spike up to 30s from different machines in our cluster.
>>>>
>>>> so for kicks tonight i tried running a test against all of our app  
>>>> servers
>>>> not for an action but for a plain .txt file ... and guess what...  
>>>> spiky
>>>> responses just from the txt file!  a 4.9kb text file!  look:
>>>>
>>>> $profileUrl /blah/Languages.txt
>>>> Host 192.168.1.100
>>>> Time taken for tests:   8.555825 seconds <<<<<<<<<<<<
>>>> Host 192.168.1.101
>>>> Time taken for tests:   0.2064 seconds
>>>> Host 192.168.1.108
>>>> Time taken for tests:   0.1436 seconds
>>>> Host 192.168.1.104
>>>> Time taken for tests:   0.1667 seconds
>>>> Host 192.168.1.112
>>>> Time taken for tests:   0.1444 seconds
>>>> Host 192.168.1.117
>>>> Time taken for tests:   0.4575 seconds
>>>> Host 192.168.1.118
>>>> Time taken for tests:   0.2015 seconds
>>>> Host 192.168.1.119
>>>> Time taken for tests:   0.2003 seconds
>>>> Host 192.168.1.120
>>>> Time taken for tests:   0.1713 seconds
>>>> Host 192.168.1.121
>>>> Time taken for tests:   7.22861 seconds <<<<<<<<<<<<
>>>> Host 192.168.1.122
>>>> Time taken for tests:   0.1615 seconds
>>>>
>>>> Here are some things to go on:
>>>> 1) we still host a few swfs from these servers that are hit  
>>>> frequently.
>>>> could these be causing some kind of blocking?
>>>> 2) we have a method in our application that acts as a proxy,  
>>>> taking a url
>>>> as a parameter, fetching the url then dumping it to the response
>>>>
>>>> I'm open to any ideas or suggestions as to what could be causing the
>>>> spikiness of response times here for a txt file.. one that never  
>>>> even hits
>>>> our java code.
>>>>
>>>> Thanks very much all...
>>>>
>>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/large-scale-app..-random-spiking-on-even-static-files%21-tp18902003s134p18902031.html
>>> Sent from the Apache Geronimo - Users mailing list archive at  
>>> Nabble.com.
>>>
>>>
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/large-scale-app..-random-spiking-on-even-static-files%21-tp18902003s134p18948990.html
Sent from the Apache Geronimo - Users mailing list archive at Nabble.com.


Mime
View raw message