Well, given the high network traffic, have you balanced the 10GbE NIC's IRQs across multiple CPUs?
And can you show us the per-thread CPU usage in top?
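One rough way to answer the IRQ question is to sum the per-CPU interrupt counters for the NIC from /proc/interrupts. The sketch below assumes the usual Linux layout (a header row of CPU names, then one row per IRQ) and an interface called "eth0"; the interface name and the function name are placeholders, not something taken from this setup.

```python
def irq_counts_per_cpu(interrupts_text, device="eth0"):
    """Sum interrupt counts per CPU for every IRQ line that mentions `device`.

    `device` is a hypothetical interface name; substitute the real one.
    """
    lines = interrupts_text.splitlines()
    cpus = lines[0].split()          # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        if device not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ label ("64:"), followed by one count per CPU.
        counts = fields[1:1 + len(cpus)]
        for i, count in enumerate(counts):
            if count.isdigit():
                totals[i] += int(count)
    return dict(zip(cpus, totals))
```

Feeding it the contents of /proc/interrupts (for example `irq_counts_per_cpu(open("/proc/interrupts").read())`) gives one total per CPU. If nearly all of the counts sit under a single CPU, the NIC interrupts are not being spread out.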
I've just upgraded to ATS 3.3.1-dev. The problem is still the same: http://i.imgur.com/1pHWQy7.png
The load lands on a single core. (The server is running only ATS.)
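To see which threads carry that load without a screenshot, the per-thread stat files under /proc/<pid>/task/ can be parsed directly. This is a minimal sketch, assuming the field layout documented in proc(5): utime and stime are fields 14 and 15, and the CPU the thread last ran on is field 39. The function names are made up for illustration. The tick counts are cumulative since thread start, so two samples taken a few seconds apart need to be compared to get current usage.

```python
import glob


def parse_task_stat(stat_line):
    """Parse one /proc/<pid>/task/<tid>/stat line (layout per proc(5))."""
    # The thread name sits in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST ")" rather than on whitespace.
    head, _, tail = stat_line.rpartition(")")
    tid_str, _, name = head.partition("(")
    rest = tail.split()  # rest[0] is field 3 (state) in proc(5) numbering
    return {
        "tid": int(tid_str),
        "name": name,
        "cpu_ticks": int(rest[11]) + int(rest[12]),  # utime (14) + stime (15)
        "last_cpu": int(rest[36]),                   # processor (39)
    }


def busiest_threads(stat_lines, top=5):
    """Return the `top` threads with the most accumulated CPU ticks."""
    parsed = [parse_task_stat(line) for line in stat_lines]
    return sorted(parsed, key=lambda t: t["cpu_ticks"], reverse=True)[:top]


def read_task_stats(pid):
    """Read the raw stat line of every thread of process `pid`."""
    lines = []
    for path in glob.glob(f"/proc/{pid}/task/*/stat"):
        with open(path) as f:
            lines.append(f.read())
    return lines
```

Calling `busiest_threads(read_task_stats(pid))` with the PID of the traffic_server process lists the heaviest threads and the core each one last ran on.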
2013/3/21 Philip <email@example.com>
I am using ATS 3.2.4 on Debian 6 (Squeeze) with a 3.2.13 kernel.
I was using the "traffic_line -r" command to watch the number of origin connections grow, and htop/atop to see that only one core is 100% utilized. I've already tested the following configuration changes:
proxy.config.accept_threads -> 0
proxy.config.exec_thread.autoconfig -> 0
proxy.config.exec_thread.limit -> 120
They had no effect: one core still reaches 100% utilization and turns out to be the bottleneck.
2013/3/21 Igor Galić <firstname.lastname@example.org>
Let's start with some simple data mining:
Which version of ATS are you running?
What OS/distro/version are you running it on?
Are you looking at stats_over_http's output to determine what's going on in ATS?
I have noticed the following strange behavior: once the number of origin connections starts to increase and the proxying speed collapses, the first core is at 100% utilization while the others are not even close to that. It seems like the origin requests are handled by the first core only. Is this expected behavior that can be changed by editing the configuration, or is this a bug?
2013/3/20 Philip <email@example.com>
I am running ATS on a pretty large server with two physical 6-core Xeon CPUs and 22 raw device disks. I want to use that server as a frontend for several file servers. It is currently configured to sit in front of two file servers. The load on the ATS server is pretty low: about 1-4% disk utilization and 500 Mbps of outgoing traffic.
Once I direct the traffic of the third file server towards ATS, something strange happens:
- The number of origin connections increases continually.
- Requests that hit ATS and are not cached are served very slowly to the client (about 35 kB/s), while requests that are served from the cache are blazingly fast.
The ATS server has a dedicated 10 Gbps port that is not maxed out, no CPU core is maxed out, there is no swapping, there are no error logs, and the origin servers are not heavily utilized either. It feels like there are not enough workers to process the origin requests.
Is there anything I can do to check whether my theory is right, and is there a way to increase the number of origin workers?