Return-Path:
Delivered-To: apmail-geronimo-user-archive@www.apache.org
Received: (qmail 6458 invoked from network); 10 Aug 2008 17:06:24 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2)
    by minotaur.apache.org with SMTP; 10 Aug 2008 17:06:24 -0000
Received: (qmail 20265 invoked by uid 500); 10 Aug 2008 17:06:22 -0000
Delivered-To: apmail-geronimo-user-archive@geronimo.apache.org
Received: (qmail 20254 invoked by uid 500); 10 Aug 2008 17:06:22 -0000
Mailing-List: contact user-help@geronimo.apache.org; run by ezmlm
Precedence: bulk
list-help:
list-unsubscribe:
List-Post:
Reply-To: user@geronimo.apache.org
List-Id:
Delivered-To: mailing list user@geronimo.apache.org
Received: (qmail 20243 invoked by uid 99); 10 Aug 2008 17:06:22 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
    by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 10 Aug 2008 10:06:22 -0700
X-ASF-Spam-Status: No, hits=2.2 required=10.0
    tests=HTML_MESSAGE,SPF_PASS,WHOIS_MYPRIVREG
X-Spam-Check-By: apache.org
Received-SPF: pass (athena.apache.org: domain of kevan.miller@gmail.com designates 74.125.44.30 as permitted sender)
Received: from [74.125.44.30] (HELO yx-out-2324.google.com) (74.125.44.30)
    by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 10 Aug 2008 17:05:23 +0000
Received: by yx-out-2324.google.com with SMTP id 3so504140yxj.85
    for ; Sun, 10 Aug 2008 10:05:33 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=gmail.com; s=gamma;
    h=domainkey-signature:received:received:message-id:from:to
     :in-reply-to:content-type:mime-version:subject:date:references
     :x-mailer;
    bh=sw9EpdM8lnl2vIsRdoB+jbo+zCgdtkST2NFesIujOpw=;
    b=vIMh7GCx9jQbB9W/rZi0dlU1IA+xmacSE2/8YHwZSV6JR6kqk29fpLVe8eyk5rRutI
     1OwXampMRBbFJoHfFik7RS/GP8AkCbALdYn+wb2AHN/Rp3sEehgy/NJvtCfbcK1IoIEJ
     ZSdwa1KgCyAsRKMnT6redkOJLG39b+wEQqJTE=
DomainKey-Signature: a=rsa-sha1; c=nofws;
    d=gmail.com; s=gamma;
    h=message-id:from:to:in-reply-to:content-type:mime-version:subject
     :date:references:x-mailer;
    b=Hg43jdRXcj6ZEpAPNVmF7o1nF5KSbp0n/BfvjrLY8T6caqVz62CrkCEaArhEAdcol+
     QtG/sLPGPlN6MJOg6xyRJZIUdlqjKHiuj1H5OxdHRKTV42R9X+K/xC8lXa0T8w/1r1AJ
     mAYDcdTDkIV9vVdRHZ0jOgR/YOPV6w5dttkl8=
Received: by 10.150.206.11 with SMTP id d11mr7989047ybg.140.1218387933517;
    Sun, 10 Aug 2008 10:05:33 -0700 (PDT)
Received: from ?10.0.1.185? ( [65.190.205.55])
    by mx.google.com with ESMTPS id 7sm2741910ywo.7.2008.08.10.10.05.32
    (version=TLSv1/SSLv3 cipher=RC4-MD5);
    Sun, 10 Aug 2008 10:05:32 -0700 (PDT)
Message-Id: <181D9AEB-C89F-4EE5-9E77-A1A0701FE5A6@gmail.com>
From: Kevan Miller
To: user@geronimo.apache.org
In-Reply-To:
Content-Type: multipart/alternative; boundary=Apple-Mail-47--437436687
Mime-Version: 1.0 (Apple Message framework v926)
Subject: Re: large-scale app.. random spiking on even static files!
Date: Sun, 10 Aug 2008 13:05:31 -0400
References: <18902003.post@talk.nabble.com> <18902031.post@talk.nabble.com>
X-Mailer: Apple Mail (2.926)
X-Virus-Checked: Checked by ClamAV on apache.org

--Apple-Mail-47--437436687
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit

On Aug 9, 2008, at 12:09 AM, Pete Clark wrote:

> Finally, here are our startup opts as it pertains to GC if it helps:
>
> -XX:+HeapDumpOnOutOfMemoryError -XX:PermSize=256m -XX:MaxPermSize=256m
> -Xmx5120m -Xms5120m -Xss128k -XX:ParallelGCThreads=20
> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:SurvivorRatio=8
> -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=31
> -XX:+AggressiveOpts -Xloggc:/var/tmp/gc.log

Have you attempted to correlate long response times with garbage collection activity? That would be my initial suspicion. I confess that I'm not up to date on the latest GC parameters. So, without a bit of reading on my part, I'm not entirely sure of the effect of your tuning parameters.

Does your application actually require a 5 gig heap?

What does your CPU utilization look like during periods of slow response time?

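One way to check the GC suspicion raised above is to scan the log written by -Xloggc:/var/tmp/gc.log for long collections and compare their offsets with the times of the slow requests. The Python below is only a sketch under stated assumptions: the line format shown in the comment is what HotSpot JVMs of this era typically wrote for ParNew/CMS and may differ on your build, and the names long_pauses and threshold_secs are made up for illustration.

```python
import re

# ASSUMED line shape for -Xloggc output (not taken from the thread), e.g.:
#   95.250: [Full GC 95.250: [CMS: 4000000K->1200000K(5000000K), 8.1234560 secs] 4100000K->1200000K(5242880K), 8.1240000 secs]
# The leading number is seconds since JVM start; the last "N secs]" on the
# line is treated as the duration of the whole event.
STAMP = re.compile(r"^(\d+\.\d+): \[")
SECS = re.compile(r"(\d+\.\d+) secs\]")


def long_pauses(lines, threshold_secs=1.0):
    """Return (uptime_secs, pause_secs) for GC events at or above the threshold."""
    found = []
    for line in lines:
        # CMS concurrent phases run alongside the application, so they are
        # skipped here rather than counted as pauses.
        if "concurrent" in line:
            continue
        stamp = STAMP.match(line)
        durations = SECS.findall(line)
        if stamp and durations:
            pause = float(durations[-1])
            if pause >= threshold_secs:
                found.append((float(stamp.group(1)), pause))
    return found
```

Passing it the open log file (for example `long_pauses(open("/var/tmp/gc.log"))`) yields one pair per long collection. Pauses of several seconds that line up with the slow test runs would support the GC theory; an empty result would point elsewhere, such as the SWF or proxy traffic.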
IIUC, the results below are for retrieval of a static text file. However, there is additional activity on the server (retrieval of Shockwave Flash files and a URL proxy). Is that correct? Is it feasible to isolate these functions?

--kevan

>
>
> On Fri, Aug 8, 2008 at 11:59 PM, pc3 wrote:
>>
>> Oh yeah, and we're using tomcat as the servlet engine.. thanks!
>>
>>
>> pc3 wrote:
>>>
>>> hey all -
>>>
>>> so i've spent the past two weeks trying to figure out why my large scale
>>> application (200m page views per month) is having some serious, yet
>>> random, spikes in response times.
>>>
>>> we've got an app running under gero 1.1.1, using hibernate, mysql, etc.
>>> i've got the response times down in general by adding memcaching and other
>>> optimizations but we still see the spikes.. e.g. most of our actions
>>> return in 0.1 sec, but occasionally (and often enough!) .. we'll see a
>>> random spike up to 30s from different machines in our cluster.
>>>
>>> so for kicks tonight i tried running a test against all of our app servers
>>> not for an action but for a plain .txt file ... and guess what... spiky
>>> responses just from the txt file! a 4.9kb text file! look:
>>>
>>> $profileUrl /blah/Languages.txt
>>> Host 192.168.1.100
>>> Time taken for tests: 8.555825 seconds <<<<<<<<<<<<
>>> Host 192.168.1.101
>>> Time taken for tests: 0.2064 seconds
>>> Host 192.168.1.108
>>> Time taken for tests: 0.1436 seconds
>>> Host 192.168.1.104
>>> Time taken for tests: 0.1667 seconds
>>> Host 192.168.1.112
>>> Time taken for tests: 0.1444 seconds
>>> Host 192.168.1.117
>>> Time taken for tests: 0.4575 seconds
>>> Host 192.168.1.118
>>> Time taken for tests: 0.2015 seconds
>>> Host 192.168.1.119
>>> Time taken for tests: 0.2003 seconds
>>> Host 192.168.1.120
>>> Time taken for tests: 0.1713 seconds
>>> Host 192.168.1.121
>>> Time taken for tests: 7.22861 seconds <<<<<<<<<<<<
>>> Host 192.168.1.122
>>> Time taken for tests: 0.1615 seconds
>>>
>>> Here are some things to go on:
>>> 1) we still host a few swfs from these servers that are hit frequently.
>>> could these be causing some kind of blocking?
>>> 2) we have a method in our application that acts as a proxy, taking a url
>>> as a parameter, fetching the url then dumping it to the response
>>>
>>> I'm open to any ideas or suggestions as to what could be causing the
>>> spikiness of response times here for a txt file.. one that never even hits
>>> our java code.
>>>
>>> Thanks very much all...
>>>
>>>
>>
>> --
>> View this message in context: http://www.nabble.com/large-scale-app..-random-spiking-on-even-static-files%21-tp18902003s134p18902031.html
>> Sent from the Apache Geronimo - Users mailing list archive at Nabble.com.
>>
>>

--Apple-Mail-47--437436687
Content-Type: text/html; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
--Apple-Mail-47--437436687--
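
The per-host timings quoted in the original post can be checked for outliers mechanically instead of by eye. The following Python is a rough sketch, assuming the report keeps the "Host ..." / "Time taken for tests: ... seconds" layout shown in the thread; the function name slow_hosts and the factor of 10 over the median are illustrative choices, not anything the posters used.

```python
import re
from statistics import median

# Patterns match the report layout quoted in the thread; leading quote
# markers such as ">>> " in front of each line are tolerated.
HOST = re.compile(r"Host\s+(\S+)")
TIME = re.compile(r"Time taken for tests:\s+([\d.]+) seconds")


def slow_hosts(report, factor=10.0):
    """Return (host, seconds) pairs whose time exceeds factor * median time."""
    hosts = HOST.findall(report)
    times = [float(t) for t in TIME.findall(report)]
    if not times:
        return []
    typical = median(times)
    # Hosts and times are paired in the order they appear in the report.
    return [(h, t) for h, t in zip(hosts, times) if t > factor * typical]
```

Run repeatedly against fresh test output, this shows whether the same hosts keep spiking (suggesting a per-machine cause) or the slow host moves around (more consistent with periodic pauses such as GC).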