httpd-dev mailing list archives

From "Lu, Yingqi" <yingqi...@intel.com>
Subject RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
Date Mon, 10 Feb 2014 19:08:43 GMT
Hi All,

I just want to ping again on this patch to see if there is any feedback or comments. This
is our first patch to the Apache community. Please let us know if there is anything we can
do to help you test and comment on the patch.

Thanks,
Yingqi

From: Lu, Yingqi
Sent: Friday, January 24, 2014 3:26 PM
To: dev@httpd.apache.org<mailto:dev@httpd.apache.org>
Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Dear All,

Our analysis of the Apache httpd 2.4.7 prefork MPM, on 32- and 64-thread Intel Xeon 2600 series
systems, using an open source three-tier social networking web server workload, revealed performance
scaling issues. In the current code, a single Listen statement (Listen 80) provides better scalability
because accept() is not serialized. However, when the system is under very high load, this can leave
a large number of child processes stuck in the D (uninterruptible sleep) state. On the other hand,
the serialized accept approach does not scale under high load either. In our analysis, a 32-thread
system with 2 Listen statements specified could scale to just 70% utilization, and a 64-thread
system with a single Listen statement specified (Listen 80, 4 network interfaces) could scale to only 60% utilization.

Based on those findings, we created a prototype patch for the prefork MPM which improves performance
and thread utilization. In Linux kernels 3.9 and newer, SO_REUSEPORT is available. This feature
allows multiple sockets to listen on the same IP:port and automatically round-robins connections
among them. We use this feature to create multiple duplicates of the original listener record and
partition the child processes into buckets, with each bucket listening on one IP:port. On older
kernels that do not support SO_REUSEPORT, we modified the "multiple Listen statement case" by
creating one listen record for each Listen statement and partitioning the child processes
into different buckets, again with each bucket listening on one IP:port.
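
For reference, here is a minimal standalone C sketch (illustrative only, not code from the patch, and the helper name make_reuseport_listener is just for illustration) of the kernel behavior we rely on: each child bucket can open its own listening socket on the same IP:port as long as SO_REUSEPORT is set before bind(), and the kernel distributes incoming connections across those sockets.

/* Illustrative sketch, assuming a Linux 3.9+ kernel with SO_REUSEPORT. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

static int make_reuseport_listener(int port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return -1; }

    int one = 1;
    /* Without SO_REUSEPORT, a second bind() to the same IP:port fails
     * with EADDRINUSE; with it, each process (or bucket) can hold its
     * own listener and the kernel round-robins connections among them. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0) {
        perror("setsockopt(SO_REUSEPORT)"); close(fd); return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 511) < 0) {
        perror("bind/listen"); close(fd); return -1;
    }
    return fd;
}

Each child process in a bucket would create its listener for its assigned IP:port along these lines; in the patch itself, the equivalent work is done through the duplicated listener records rather than raw sockets.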

Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase
on a 32-thread system with 2 Listen statements (Linux kernel 3.10.4). With an older kernel
(Linux kernel 3.8.8, without SO_REUSEPORT), a 10% performance gain was measured. With the single
Listen statement configuration (Listen 80), we observed over 2x throughput improvement on
modern dual-socket Intel platforms (Linux kernel 3.10.4). In addition to the throughput
improvements, we also observed a large reduction in response time in our tests [1].

Following the feedback on the Bugzilla entry where we originally submitted the patch,
we removed the dependency on an APR change to simplify patch testing. Thanks to Jeff
Trawick for his good suggestion! As a next step, we are also actively working on extending the
patch to the worker and event MPMs. Meanwhile, we would like to gather comments from all of you
on the current prefork patch. Please take some time to test it and let us know how it works in
your environment.

This is our first patch to the Apache community. Please help us review it and let us know
if there is anything we might revise to improve it. Your feedback is very much appreciated.

Configuration:
<IfModule prefork.c>
    ListenBacklog 105384
    ServerLimit 105000
    MaxClients 1024
    MaxRequestsPerChild 0
    StartServers 64
    MinSpareServers 8
    MaxSpareServers 16
</IfModule>

[1] Software and workloads used in performance tests may have been optimized for performance
only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured
using specific computer systems, components, software, operations and functions. Any change
to any of those factors may cause the results to vary. You should consult other information
and performance tests to assist you in fully evaluating your contemplated purchases, including
the performance of that product when combined with other products.

Thanks,
Yingqi
