httpd-dev mailing list archives

From: Min Xu <...@cae.wisc.edu>
Subject: Strange Behavior of Apache 2.0.43 on SPARC MP system
Date: Mon, 10 Feb 2003 19:50:35 GMT
Hi All,

Sorry for posting this directly to the development list, but I think
this is not a user setup problem, and it is so strange that maybe only
you guys will have a clue about what's going on.

I am a student at UW-Madison. In order to study the computer
architecture of commercial multiprocessor servers, we use Apache as
one of our important workloads.

I am the one who set up the workload on a 14-processor Sun Enterprise
server. During setup I found a very strange behavior of the Apache
server (running with the worker MPM). Essentially the strange thing is
this:

  The server's optimal throughput is not achieved by a greedy client
  that drives the server with no think time. With a tiny amount of
  think time, much better throughput is achievable. Also, with the
  greedy client, the server's performance decreases over time, which
  seems very counter-intuitive.

Of course, just giving you the short description above does not help
you to help me, so I will give the detailed problem description and
data below. With your understanding of the source code, maybe you can
give me some more hypotheses to try.

Workload background
-------------------
The setup of the Apache workload is fairly simple compared with some
of the other workloads we have (OLTP). In this workload, we have an
HTTP server and an automatic request generator (SURGE). Both programs
are highly multi-threaded. The server has a pool of static text files
to be served from known URLs to the request generator (the client).
The sizes of the files follow a statistical distribution, and the
client has multiple threads, each emulating a user who accesses a
series of files in a fixed order.
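
For concreteness, here is a rough sketch of what each client thread
conceptually does (this is not the actual SURGE source; fetch_url()
and the URL list are just placeholders I made up for illustration):

    #include <stddef.h>
    #include <pthread.h>

    /* Placeholder standing in for SURGE's real HTTP fetch, which opens a
       connection to the server and reads the whole file body. */
    static int fetch_url(const char *url)
    {
        (void)url;
        return 0;
    }

    /* Fixed sequence of files this emulated user walks through. */
    static const char *urls[] = { "/file001.txt", "/file002.txt", "/file003.txt" };

    /* Per-user thread body (would be passed to pthread_create). The
       greedy client issues the next request immediately, with no think
       time between fetches. */
    static void *user_thread(void *arg)
    {
        (void)arg;
        for (;;) {
            for (size_t i = 0; i < sizeof(urls) / sizeof(urls[0]); i++)
                fetch_url(urls[i]);   /* no delay between fetches */
        }
        return NULL;
    }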

In the previous setup of the workload, we removed client think time.
The basis for that is the following (we also had to put the server and
the client on the same machine for other reasons):

The workload (server + client) is a closed queueing system. The
throughput of the system is ultimately determined by the bottleneck in
the system. Having think time in the client only increases the
parallelism in the system; it shouldn't change the maximum throughput
too much. BTW, our goal is to achieve a realistic workload setup with
the available hardware.

If you think about it, at our current server throughput level, say
5000 trans/sec, if each user has a 1 second think time between
fetching files, it takes about 5000 users to sustain that throughput.
On the other hand, if we remove the think time from the client, maybe
10 users can generate the same 5000 requests per second. So the
difference is that one server is serving 5000 httpd threads and the
other only 10. 10 won't be worse (in terms of server behavior) than
5000, right? A greedy client won't be worse (in terms of performance)
than a lazy client, right?
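
For reference, the standard response-time law for a closed system
makes this arithmetic explicit. With N users, per-request response
time R, and think time Z, the throughput X is roughly:

    X = N / (R + Z)

    Z = 1 s,  R << Z:    N = X * (R + Z) ~= 5000/s * 1 s    = 5000 users
    Z = 0,    R ~= 2 ms: N = X * R       =  5000/s * 0.002 s =   10 users

(The 2 ms response time is just a value I back out from the numbers
above to make the 10-user case come out the same; it is not measured.)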

Well, it is not that simple...


I know how to get higher performance, but I don't know why it works!
--------------------------------------------------------------------
I have two version of surge clients in my hand. One is the original,
one is my modified version. The difference between them would be the
client efficiency. My modified version would fetch files more efficiently
(because I made it emulate a simpler user) and have less thread
switching overhead.

However, when I compared the server throughput using these two
clients, I got very surprising results, roughly:

  old client: 3000 trans/sec
  new client: starts out at 4700 trans/sec, gradually degrades to 2500
              trans/sec after 10-20 minutes of runtime.

And this really puzzled me for a long time. My supposed performance
enhancement did not improve the server's Xput, it hurt it!

It turns out the reason is that the new client was too efficient! I
added think time between each URL request, and the new client was able
to drive the server's Xput as high as 5000 trans/sec. But note, the
really interesting thing is not the think time itself, but how
sensitively the Xput is affected by it.

I'd prefer to call the think time "delay time" in the following,
because I really only introduced a very small amount of delay between
each file fetch. The results can be seen in the following plots:

http://www.cs.wisc.edu/~xu/files/delay_results.eps
http://www.cs.wisc.edu/~xu/files/side1.eps
http://www.cs.wisc.edu/~xu/files/side2.eps

In this experiment, instead of using both the old and new versions of
the client, I just used the new version with varying delay times and
numbers of threads. Since there are two degrees of freedom in the
client, the plot is in 3D. The figures side1 and side2 are roughly the
2D projections of Xput vs. threads and Xput vs. delay time.

Each point on the plot is a 30-minute benchmark run on the
14-processor MP system.

Clearly, driving the server with no delay time is not optimal. Whether
using the same number of threads or fewer, the server Xput is no
higher than its delayed counterparts. However, you can see that the
server Xput rises rapidly with the number of clients when the delay
time is 0. On the other hand, with a small number of clients, the
server Xput is inversely proportional to the delay time, and with a
larger number of clients, the server Xput is proportional to the delay
time.

I don't understand why such a small delay time (1-3 us, using
nanosleep on Solaris) would help.
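
The delay is inserted between file fetches roughly like this (a
minimal sketch of my modification, not the exact code; delay_us and
its usec parameter are just illustrative names):

    #include <time.h>

    /* Tiny delay between file fetches; on Solaris this goes through
       nanosleep(). Note the kernel may well sleep longer than requested,
       depending on timer resolution and scheduling. */
    static void delay_us(long usec)
    {
        struct timespec req;
        req.tv_sec  = 0;
        req.tv_nsec = usec * 1000L;   /* microseconds -> nanoseconds */
        (void)nanosleep(&req, NULL);  /* ignore EINTR in this sketch */
    }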

Some hypotheses are that the Apache server itself has some internal
mechanism that slows down greedy clients, or that Solaris does not
schedule the server threads well enough to handle such short request
intervals, or that the greedy client simply consumes too much CPU
time.

I'd appreciate any suggestions/comments from you.

-Min

-- 
Rapid keystrokes and painless deletions often leave a writer satisfied with
work that is merely competent.
  -- "Writing Well" Donald Hall and Sven Birkerts
