httpd-dev mailing list archives

From "Mark D. Anderson" <>
Subject Re: Apache 2.0 ideas
Date Mon, 16 Nov 1998 00:25:15 GMT
>> Or would it turn out that the file system has the same
>> set of trade-offs that a userland web server would: in the end,
>> you don't have as much RAM as you have disk.
>Is that true though?  It's not that expensive to buy a 1Gb+ x86 system
>these days.  Does your all static-content site have that much *busy* data? 
>If so, I suspect you've got one hell of a backbone... and in that case you
>can afford an alpha with many multiple gigabytes of RAM.  If in one second
>you need more than 1Gb of data in ram, your outgoing bandwidth is at least
>1Gbyte/s... no?
>The kernel vs. userland debate requires some measurements.  On NT it is
>definitely a win.  On some unixes, it may not be worth the extra
>difficulty in implementing the kernel-side daemon. 

I'm working this out in my mind out loud as it were, so you'll
have to be patient with me.

It seems there are two distinct issues here:
1. sizing compute resources for optimal price/performance.
2. choosing among implementation approaches for optimal efficiency.

(1) I would agree that (a) for many web sites the entire data set can
be kept in RAM, and (b) for many web sites, the real bottleneck is
the network. However, you lost me in your transition from "1Gb" of data
to "1Gb/s" of bandwidth. The bandwidth to the system could consist
of a series of requests to the same (small amount of) data, hence far
less than 1Gb (or whatever) of RAM-or-disk is needed. Or that bandwidth
could consist of a series of requests to a constantly varying set of resources.
If the total amount of data can't affordably be kept in RAM, then one
must choose the appropriate amount of RAM cache given the request stream
and one's available money. Presumably this would be based in part on 
Jim Gray's "5-minute rule" concerning cache sizing.
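
To make the cache-sizing point concrete, here is a rough sketch of the
break-even calculation behind the 5-minute rule, in the form I recall from
Gray's papers: a page is worth keeping in RAM if it is re-referenced more
often than (pages per MB of RAM / accesses per second per disk) times
(price per disk drive / price per MB of RAM). The function names and the
figures plugged in below are illustrative assumptions, not measurements
of any real site or hardware.

```python
def break_even_interval_seconds(pages_per_mb_ram, accesses_per_sec_per_disk,
                                price_per_disk, price_per_mb_ram):
    """Return the re-reference interval (in seconds) at which holding a page
    in RAM costs about the same as re-reading it from disk each time."""
    technology_ratio = pages_per_mb_ram / accesses_per_sec_per_disk
    economic_ratio = price_per_disk / price_per_mb_ram
    return technology_ratio * economic_ratio


def worth_caching(mean_seconds_between_requests, break_even_seconds):
    """A page requested more often than the break-even interval is cheaper
    to keep in RAM than to fetch from disk on every request."""
    return mean_seconds_between_requests < break_even_seconds


# Illustrative placeholder figures (assumptions): 8 KB pages (128 per MB),
# 64 random accesses/s per disk, $2000 per drive, $15 per MB of RAM.
interval = break_even_interval_seconds(128, 64, 2000.0, 15.0)
hot_page = worth_caching(30.0, interval)     # requested every 30 seconds
cold_page = worth_caching(3600.0, interval)  # requested once an hour
```

With those placeholder prices the break-even interval comes out at a few
minutes, so the RAM cache would be sized to hold whatever the request
stream re-touches within that window, regardless of total bandwidth.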

(2) Regarding implementation approach, what I was trying to ask in
my stumbling fashion was for a comparison between: (a) mapping the
entire disk into virtual memory, and letting the virtual memory
system just use its own algorithms for managing the working set (this
mapping could be done directly by the kernel to implement an HTTP file
system, or could be done by a user process like nfsd); or
(b) implementing your own cache algorithm where "small" files
are in RAM, and "large" files are done with sendfile() -- basically,
betting that you can find a better algorithm than file servers use.
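
For what it's worth, option (b) could be sketched roughly as below. This is
only an illustration under my own assumptions: the class name, the 64 KB
cutoff, and the returned labels are all hypothetical, there is no eviction
or invalidation, and it relies on os.sendfile(), which Python exposes only
on Unix-like systems.

```python
import os
import socket

SMALL_FILE_LIMIT = 64 * 1024  # hypothetical cutoff between "small" and "large"


class SmallFileCache:
    """Keep small files in RAM; hand large files to the kernel via sendfile()."""

    def __init__(self, small_file_limit=SMALL_FILE_LIMIT):
        self.small_file_limit = small_file_limit
        self._bodies = {}  # path -> file contents (never evicted in this sketch)

    def serve(self, path, conn):
        """Send the file at `path` over the connected socket `conn` and
        return a label describing which route was taken."""
        body = self._bodies.get(path)
        if body is not None:
            conn.sendall(body)
            return "ram-hit"

        size = os.path.getsize(path)
        if size <= self.small_file_limit:
            # Small file: read it once, remember it, send from user memory.
            with open(path, "rb") as f:
                body = f.read()
            self._bodies[path] = body
            conn.sendall(body)
            return "ram-miss"

        # Large file: let the kernel copy from the file to the socket.
        with open(path, "rb") as f:
            offset = 0
            while offset < size:
                sent = os.sendfile(conn.fileno(), f.fileno(), offset, size - offset)
                if sent == 0:
                    break  # file shrank underneath us; stop rather than spin
                offset += sent
        return "sendfile"
```

Option (a) would drop the dictionary entirely: map the files and leave it
to the virtual memory system to decide what stays resident. The question
is whether the hand-rolled split above actually beats that.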

