httpd-dev mailing list archives

From: Jim Gettys
Subject: Re: Apache 2.0 ideas
Date: Tue, 03 Nov 1998 16:31:55 GMT

> From: Dean Gaudet
> Date: Mon, 2 Nov 1998 23:47:28 -0800 (PST)
> Subject: Re: Apache 2.0 ideas
> -----
> On Tue, 3 Nov 1998, Andrew Finkenstadt wrote:
> > On further reflection and after reading the "Halloween Document" and
> > Microsoft's alleged desire to more tightly integrate IIS into the
> > kernel, ...
> IBM and Sun have already done it.
> > Yes, it would leave behind many flavors of Unix that don't have good support
> > for shared memory, but it would beat the pants out of Microsoft.
> Why worry about shared memory?  We're not going to get anywhere further in
> the performance game without threads.  There's no point in even worrying
> about comparing the performance of unixes that lack threads... if they
> lack threads they probably also lack all the fundamental TCP/IP
> improvements necessary to even think about comparing HTTP performance.
> > We should take a page from Oracle's book on semaphores and enqueues, by
> > making the critical sections as small as possible, and as fine-grained as
> > possible, allowing multiple processes access to the data without
> > road-blocking.
> There's essentially no userland synchronization required in a static
> content web server (i.e. a benchmark web server).  For example on linux
> open()/sendfile() should produce the fastest web server possible from
> userland... and there's nothing in there which requires userland to
> synchronize (you have to do a little magic with memory allocation).  So
> this is easy.

No, from userland, the fastest server will be one which caches (small)
objects in memory, and then does a single send() of the cached memory.

File opens are expensive.  Save sendfile() for big objects, where the
open overhead isn't significant.

Let's take a page with embedded objects, most of which are small enough
to be cached.  If you do the server right and are caching in main memory
(rather than always sending from files), you can potentially do multiple
objects in a single writev() system call.
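
A minimal sketch of that idea, in Python rather than C for brevity.  The
CACHE table, its contents, and send_cached() are hypothetical names made up
for this illustration; os.writev() is the standard wrapper around the
writev() system call.  It assumes complete responses are already sitting in
memory, and it ignores short writes, which a real server would have to
handle.

```python
import os
import socket

# Hypothetical in-memory cache: path -> complete, pre-built response bytes.
CACHE = {
    "/index.html": b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello",
    "/logo.gif": b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nGIF",
}


def send_cached(sock, paths):
    """Send the cached responses for `paths` with a single writev() call.

    Returns the number of bytes the kernel accepted.  A real server would
    have to resend the remainder after a short write.
    """
    buffers = [CACHE[p] for p in paths]
    return os.writev(sock.fileno(), buffers)


def demo():
    # A connected socket pair stands in for a client connection.
    server_side, client_side = socket.socketpair()
    try:
        sent = send_cached(server_side, ["/index.html", "/logo.gif"])
        received = client_side.recv(4096)
        return sent, received
    finally:
        server_side.close()
        client_side.close()
```

Here two responses leave the server in one system call, with no open() at
all on the request path.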

This beats a sequence of open()/sendfile()'s all to h*** and gone.

Fundamentally, for a pipelined server with good buffering, you can end
up with much less than one system call/operation.  This is what makes
the X Window System fast (when well implemented).  One system call reads
a bunch of requests into a buffer, another writes out the results
accumulated in an output buffer.  If there are a bunch of requests in a
batch, you get well under one system call/operation.  In the X case, which
has a relatively compact protocol (though not as compact as I'd do if we
had it to do over again), you are way under one system call/request.

The basic scheduling loop in the X server is to do a select(), which
tells you all the connections that have work to do; it then round robins
among those connections, and handles a buffer full of requests before
moving on; it only does another select() when it has done all the work
it can on all connections.  This means the select() overhead drops as
load on the server goes up, so that it runs at best performance under
load (when cycles are scarcest); this is the ideal situation.  A web
server probably can't be as simpleminded, but you get the idea anyway.
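
A rough sketch of one pass of such a loop, again in Python for brevity.  This
is not the X server's code: serve_ready() and handle() are hypothetical, the
newline-terminated request format is an assumption made for the example, and
partial requests, short writes and closed connections are ignored.  The
function counts its own system calls so the ratio is visible.

```python
import select
import socket


def handle(request):
    # Hypothetical request handler: reply with the request in upper case.
    return request.upper() + b"\n"


def serve_ready(conns, bufsize=4096):
    """One pass: a single select(), then one batch per ready connection.

    Returns (requests_handled, system_calls_made).
    """
    ready, _, _ = select.select(conns, [], [], 0)
    syscalls = 1
    handled = 0
    for conn in ready:                  # round robin over ready connections
        data = conn.recv(bufsize)       # one read picks up a whole batch
        syscalls += 1
        requests = [r for r in data.split(b"\n") if r]
        if not requests:
            continue
        replies = b"".join(handle(r) for r in requests)
        conn.send(replies)              # one write carries every reply
        syscalls += 1
        handled += len(requests)
    return handled, syscalls


def demo():
    client, server = socket.socketpair()
    try:
        # Four pipelined requests arrive before the server looks at them.
        client.sendall(b"get a\nget b\nget c\nget d\n")
        handled, syscalls = serve_ready([server])
        return handled, syscalls, client.recv(4096)
    finally:
        client.close()
        server.close()
```

With four pipelined requests in the buffer, the pass costs three system calls
(select, read, write), and the ratio keeps improving as batches grow.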

And yes, there was a crazy who thought putting X in the kernel was a win
as well.  It didn't end up with better performance, and never got very
stable (since a bug crashed your system, debugging was a pain).
The CPU runs just as fast in user space as in the kernel...

The problem this model faces for a Web server is how the server gets
informed that its underlying database has changed, so that it can't
trust its in-memory copy.  I leave this as an exercise for the readers :-).
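
One possible answer, offered only as a sketch and not necessarily the one
intended here: revalidate each cached entry with a stat() before using it,
and reload when the modification time or size has changed.  FileCache is a
hypothetical class invented for this example, and the approach still costs
one stat() per lookup.

```python
import os
import tempfile


class FileCache:
    """Hypothetical cache that checks stat() data before trusting an entry."""

    def __init__(self):
        self._entries = {}  # path -> ((mtime_ns, size), file contents)

    def get(self, path):
        st = os.stat(path)
        stamp = (st.st_mtime_ns, st.st_size)
        entry = self._entries.get(path)
        if entry is not None and entry[0] == stamp:
            return entry[1]             # file unchanged: serve from memory
        with open(path, "rb") as f:     # new or changed: reload from disk
            body = f.read()
        self._entries[path] = (stamp, body)
        return body


def demo():
    cache = FileCache()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "page.html")
        with open(path, "wb") as f:
            f.write(b"old")
        first = cache.get(path)
        with open(path, "wb") as f:
            f.write(b"newer content")   # different size, so the stamp changes
        second = cache.get(path)
        third = cache.get(path)         # unchanged since the reload
    return first, second, third
```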
					- Jim
