From: Igor Galić <i.galic@brainsware.org>
To: users@trafficserver.apache.org
Date: Thu, 21 Mar 2013 22:23:00 +0000 (UTC)
Subject: Re: ATS performs poorly proxying larger files

This may be useful:
http://kerneltrap.org/mailarchive/linux-netdev/2010/4/15/6274814/thread

----- Original Message -----
> Hi Yongming,
>
> I haven't changed the networking configuration, but I've also noticed
> that once the first core is at 100% utilization the server no longer
> answers all ping requests and shows packet loss. Could this be a sign
> that all network traffic is being handled by the first core?
>
> You can find a screenshot of the per-thread output of top here:
> http://i.imgur.com/X3te2Ru.png
>
> Best Regards
> Philip
>
> 2013/3/21 Yongming Zhao <ming.zym@gmail.com>
>
> > Well, given the high network traffic, have you balanced the 10GbE
> > NIC IRQs across multiple CPUs?
> >
> > And can you show us the per-thread CPU usage in top?
> >
> > Thanks
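(For reference: the linked netdev thread discusses Receive Packet
Steering, i.e. spreading receive processing across CPUs in software.
Below is a minimal sketch of how one might check and adjust NIC
IRQ/RPS distribution on Linux; the interface name eth0, the IRQ number
placeholder and the CPU masks are only examples and have to be adapted
to the actual machine.)

    # Which CPUs are the NIC's interrupts landing on?
    grep eth0 /proc/interrupts
    cat /proc/irq/<IRQ_NUMBER>/smp_affinity

    # Pin a NIC IRQ to a set of CPUs (hex CPU mask; 'f' = CPUs 0-3)
    echo f > /proc/irq/<IRQ_NUMBER>/smp_affinity

    # Or enable RPS on a receive queue so softirq work is spread
    # across CPUs 0-7 ('ff'); repeat per rx queue as needed
    echo ff > /sys/class/net/eth0/queues/rx-0/rps_cpus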
> > On 2013-3-21, at 7:42 PM, Philip <flips01@gmail.com> wrote:
> >
> > > I've just upgraded to ATS 3.3.1-dev. The problem is still the
> > > same: http://i.imgur.com/1pHWQy7.png
> > >
> > > The load goes to one core. (The server is only running ATS.)
> > >
> > > 2013/3/21 Philip <flips01@gmail.com>
> > >
> > > > Hi Igor,
> > > >
> > > > I am using ATS 3.2.4, Debian 6 (Squeeze) and a 3.2.13 kernel.
> > > >
> > > > I was using the "traffic_line -r" command to see the number of
> > > > origin connections growing, and htop/atop to see that only one
> > > > core is 100% utilized. I've already tested the following
> > > > changes to the configuration:
> > > >
> > > > proxy.config.accept_threads -> 0
> > > > proxy.config.exec_thread.autoconfig -> 0
> > > > proxy.config.exec_thread.limit -> 120
> > > >
> > > > They had no effect: there is still one core that becomes 100%
> > > > utilized and turns out to be the bottleneck.
> > > >
> > > > Best Regards
> > > > Philip
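(A minimal sketch of how the settings quoted above would look in
records.config. The values are the ones Philip lists; the exec_thread
settings are generally only picked up at start-up, so a restart is
needed for them to take effect. The restart command and process name
are assumptions and may differ per installation.)

    CONFIG proxy.config.accept_threads INT 0
    CONFIG proxy.config.exec_thread.autoconfig INT 0
    CONFIG proxy.config.exec_thread.limit INT 120

    # After editing records.config, restart ATS, e.g. via the init
    # script or "traffic_line -L", then watch per-thread CPU again:
    top -H -p "$(pidof traffic_server)"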
> > > > 2013/3/21 Igor Galić <i.galic@brainsware.org>
> > > >
> > > > > Hi Philip,
> > > > >
> > > > > Let's start with some simple data mining:
> > > > >
> > > > > Which version of ATS are you running?
> > > > > What OS/distro/version are you running it on?
> > > > >
> > > > > Are you looking at stats_over_http's output to determine
> > > > > what's going on in ATS?
> > > > >
> > > > > -- i
> > > > >
> > > > > > I have noticed the following strange behavior: once the
> > > > > > number of origin connections starts to increase and the
> > > > > > proxying speed collapses, the first core is at 100%
> > > > > > utilization while the others are not even close to that.
> > > > > > It seems like the origin requests are handled by the first
> > > > > > core only. Is this expected behavior that can be changed
> > > > > > through the configuration, or is this a bug?
> > > > > >
> > > > > > 2013/3/20 Philip <flips01@gmail.com>
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I am running ATS on a pretty large server with two
> > > > > > > physical 6-core XEON CPUs and 22 raw-device disks. I want
> > > > > > > to use that server as a frontend for several file
> > > > > > > servers. It is currently configured to sit in front of
> > > > > > > two file servers. The load on the ATS server is pretty
> > > > > > > low: about 1-4% disk utilization and 500 Mbps of outgoing
> > > > > > > traffic.
> > > > > > >
> > > > > > > Once I direct the traffic of the third file server
> > > > > > > towards ATS, something strange happens:
> > > > > > >
> > > > > > > - The number of origin connections increases continually.
> > > > > > > - Requests that hit ATS and are not cached are served
> > > > > > >   really slowly to the client (about 35 kB/s), while
> > > > > > >   requests served from the cache are blazingly fast.
> > > > > > >
> > > > > > > The ATS server has a dedicated 10 Gbps port that is not
> > > > > > > maxed out, no CPU core is maxed out, there is no
> > > > > > > swapping, there are no error logs, and the origin servers
> > > > > > > are not heavily utilized either. It feels like there are
> > > > > > > not enough workers to process the origin requests.
> > > > > > >
> > > > > > > Is there anything I can do to check whether my theory is
> > > > > > > right, and a way to increase the number of origin
> > > > > > > workers?
> > > > > > >
> > > > > > > Best Regards
> > > > > > > Philip

--
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515  2EA5 4B1D 9E08 A097 C9AE
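(A rough way to test the "not enough origin workers" theory is the
per-thread view in top together with ATS's own counters. A sketch,
assuming the stats_over_http plugin is loaded and ATS answers on port
8080; the port and the exact metric names are assumptions and should
be checked against the running version.)

    # Are all of traffic_server's event threads busy, or just one?
    top -H -p "$(pidof traffic_server)"

    # Current and cumulative origin-server connections
    traffic_line -r proxy.process.http.current_server_connections
    traffic_line -r proxy.process.http.total_server_connections

    # Full stats snapshot via the stats_over_http plugin
    curl http://localhost:8080/_stats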