httpd-dev mailing list archives

From Greg Ames <grega...@raleigh.ibm.com>
Subject Re: Saturation stress testing httpd problems
Date Thu, 21 Sep 2000 19:58:12 GMT
"Bates, Jonathan" wrote:
> 
> Hi All,
> 
> I don't know if this is the correct forum for this problem - if not please
> let me know of another.
> 
> Here goes..
> 
> We have an infrastructure that uses Apache (latest version) running on a
> server farm of solaris boxes to support (eventually) up to 10 million
> concurrent users (apache is used as a gateway server via our specialist
> module).
> 
> I've written a test program that stresses the server by continuously sending
> a number of concurrent requests to Apache (currently configured to run 200
> or so httpd's).
> 
> When the concurrency exceeds the 200 configured httpds, Apache gets swamped
> and falls over. - This I don't mind too much, but after getting swamped it
> falls over and doesn't recover - which is a big problem.
> 
> It seems to leave thousands of opened sockets on port 80 with TIME_WAIT
> status - which can be seen via 'netstat -a'.
> 
> The problem doesn't lie in our specialist gateway module as it also happens
> in raw Apache.
> 
> Has anyone else encountered this problem, if so do you know of any
> solutions?
> 

I've seen similar problems when benchmarking Web servers.  They usually
don't show up in a production environment, where the load comes from
many client machines.  The typical symptom is "canyons" in a throughput
graph that repeat every minute or two.  Cause: the client box that's
driving the benchmark runs out of unique ephemeral ports.

Every TCP connection has to have a unique 4-tuple consisting of
(server-IPaddr, server-port, client-IPaddr, client-port).  If the first
three elements stay the same in your configuration, the client port must
be different in order to make it unique.  Many TCP/IP stacks have the
default ephemeral port range set to 1024-4999.  Legend has it that this
is due to a typo in an ancient BSD stack.  Whatever the cause, this is
less than 4000 unique ephemeral ports.  I can easily get Apache to
handle over 2000 requests/sec on my ThinkPad, so it's not hard to eat up
the entire range in a couple of seconds.  Then you typically have to
wait a minute or two until the TIME_WAIT guys go away.
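
To make the arithmetic above concrete, here is a rough sketch in shell.
The helper names port_count and seconds_to_exhaust are made up for
illustration, and the estimate assumes one new client port per request
with no port freed until TIME_WAIT expires.

```shell
# Hypothetical helper: number of ports in an inclusive range (low high).
port_count() {
    echo $(( $2 - $1 + 1 ))
}

# Hypothetical helper: seconds (rounded up) until the range is used up
# at a given request rate. Assumes every request opens a new connection
# and no port is reused in the meantime.
seconds_to_exhaust() {
    ports=$(port_count "$1" "$2")
    echo $(( (ports + $3 - 1) / $3 ))
}

port_count 1024 4999               # prints 3976
seconds_to_exhaust 1024 4999 2000  # prints 2
```

With the 1024-63000 range suggested further down, the same rate takes
about half a minute to use up the ports.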

You might want to count the number of TIME_WAITs you have. If it's
slightly less than 4000 per client box, I would bet a pint of Old
Speckled Hen that this is your problem.  You can usually configure the
client's TCP/IP stack to have a much bigger ephemeral port range.  On
Linux you can use the sysctl command, which is part of the "procps" RPM,
to query/set ip_local_port_range.  Or use the older method:

cat /proc/sys/net/ipv4/ip_local_port_range                  (to query)
echo "1024 63000" > /proc/sys/net/ipv4/ip_local_port_range  (to make it a lot bigger)
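
For counting the TIME_WAITs, something like the following might do.
It's only a sketch: count_time_wait and pct_of_range are hypothetical
helper names, and it assumes your netstat prints the state as the
literal string TIME_WAIT, one connection per line.

```shell
# Hypothetical helper: count lines mentioning TIME_WAIT in netstat-style
# text read from stdin. grep -c exits non-zero when the count is 0, so
# "|| true" keeps the function from reporting a failure in that case.
count_time_wait() {
    grep -c 'TIME_WAIT' || true
}

# Hypothetical helper: what percentage of an inclusive port range
# (count low high) a given number of connections occupies.
pct_of_range() {
    echo $(( $1 * 100 / ($3 - $2 + 1) ))
}

# Example use on the client box (not run here):
#   n=$(netstat -an | count_time_wait)
#   pct_of_range "$n" 1024 4999
```

If the percentage comes out in the high nineties, the port range is the
likely culprit.  On Linux the sysctl form of the query above is
`sysctl net.ipv4.ip_local_port_range`.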

Good luck
Greg
