httpd-dev mailing list archives

From (Jim Gettys)
Subject Re: Apache 2.0 ideas
Date Tue, 03 Nov 1998 18:31:20 GMT

> From: Dean Gaudet <>
> Date: Tue, 3 Nov 1998 10:08:57 -0800 (PST)
> To: Jim Gettys <>
> Cc:
> Subject: Re: Apache 2.0 ideas
> -----
> On Tue, 3 Nov 1998, Jim Gettys wrote:
> > No, from userland, the fastest server will be one which caches (small)
> > objects in memory, and then does a single send() of the cached memory.
> >
> > File opens are expensive.  Save sendfile() for big objects, where the
> > open overhead isn't significant.
> We can argue about it, but the best thing would be to measure ;)

Yup.  Measurement is the only way.

> open()s aren't as expensive under linux as they are elsewhere... and
> sendfile() isn't "thread safe" in the sense that you can use a single fd
> with multiple threads (so caching open fds isn't worth it).  Linus keeps
> claiming that open() is the way to go, it'd be worthwhile to prove or
> disprove his claim.

Open on UNIX/Linux is relatively cheap, but that is cheap relative to
other operating systems (on a 1 MIPS VAX, a file open on VMS was 10x as
expensive as on UNIX, at 1/4 CPU second). Things have changed somewhat
as systems have sped up, but I'd be amazed if Linus does much better
than "conventional UNIX".

I think Linus is wrong here (from first hand experience).

And it doesn't solve your synchronization problem; a file can be updated
rather than replaced.

I agree with the attitude that measurement is best, but 2 system calls/request
(open(), sendfile()) versus a fraction of one isn't even remotely
comparable.  When I get my home system up under Linux, though, I may make
some measurements.
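
To make "a fraction of one" concrete, here is a minimal sketch of
batching, not measured code.  BatchingWriter, RESPONSE, and the 4096-byte
limit are hypothetical names and values chosen for illustration: small
responses queue in a userland buffer and go out in a single send.

```python
import socket

RESPONSE = b"HTTP/1.0 200 OK\r\n\r\nhello"  # stand-in for a small cached object


class BatchingWriter:
    """Queue small responses and flush them with one send call."""

    def __init__(self, sock, limit=4096):
        self.sock = sock
        self.limit = limit        # hypothetical flush threshold, in bytes
        self.pending = bytearray()
        self.responses = 0        # responses queued so far
        self.send_calls = 0       # send calls issued from our side

    def write(self, response):
        self.pending += response
        self.responses += 1
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self):
        if self.pending:
            # One call for the whole batch; sendall() may still loop
            # internally if the kernel accepts only part of the buffer.
            self.sock.sendall(bytes(self.pending))
            self.send_calls += 1
            self.pending.clear()


def demo(count=10):
    a, b = socket.socketpair()
    writer = BatchingWriter(a)
    for _ in range(count):
        writer.write(RESPONSE)
    writer.flush()
    a.close()
    received = 0
    while True:
        chunk = b.recv(65536)
        if not chunk:
            break
        received += len(chunk)
    b.close()
    return writer.responses, writer.send_calls, received
```

With ten responses well under the limit, demo() issues one send call from
the writer's side, against the twenty calls an open()/sendfile() pair per
request would need.
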
> To cache things in memory requires synchronization between threads... to
> use open() lets the kernel do its best job of synchronization... which is
> really where I prefer to let that happen.  If userland could do fancy
> spinlock tricks I wouldn't worry about it so much.  But those are
> extremely non-portable.  I'd rather give the kernel as many opportunities
> as possible to parallelize on SMP systems.  ('cause then it's the kernel
> folks' problems to make things go fast ;)

Userland can do fancy spin-lock tricks on good systems; they make a system
call only if the lock stays contended for a significant period.
Acquiring an uncontended lock should cost no more than a trip to main
memory on a good operating system.
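
The control flow of such a lock can be sketched as below.  This only
illustrates the spin-then-block shape: SpinThenBlockLock and its spin
budget are hypothetical, and Python's threading.Lock is not guaranteed to
stay out of the kernel on the fast path, so take it as a picture of the
idea rather than a real spinlock.

```python
import threading


class SpinThenBlockLock:
    """Try a bounded number of non-blocking acquires, then block."""

    def __init__(self, spins=100):
        self._lock = threading.Lock()
        self.spins = spins   # hypothetical spin budget before blocking
        self.fast = 0        # acquisitions that succeeded while spinning
        self.slow = 0        # acquisitions that fell back to blocking

    def acquire(self):
        for _ in range(self.spins):
            if self._lock.acquire(blocking=False):
                self.fast += 1   # counter updated while holding the lock
                return
        # Lock stayed contended for the whole budget: wait for it.
        self._lock.acquire()
        self.slow += 1

    def release(self):
        self._lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
```

Uncontended use, such as `with lock: ...` from a single thread, always
takes the fast path; a spin budget of 0 forces the blocking path every time.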

Build for a good system here; those who don't measure up will profile,
find they are spending too much time in some part or the other, and
then fix their systems.  So long as Apache can be made to run on most
systems, you've won the portability game.  Vendors (and Linux) will
fix their systems as it becomes clear there is a win.

> > Fundamentally, for a pipelined server with good buffering, you can end
> > up with much less than one system call/operation.  This is what makes
> Yeah I showed this with apache 1.3 with a few small tweaks -- the main one
> required is to get rid of the calls to time() and use a word of shared
> memory for the time.  (This is functionality the kernel/libc folks should
> provide, either through shared mem or through the now ubiquitous time
> stamp counters on all modern processors.)  I showed 75 responses in 21
> syscalls.

Yup; X had the time problem too: events get timestamped.
The solution in X was either:
	o just do a time call on every batch of input events.
	(crummy systems)
	o put the time into a shared memory interface, updated by the
	device driver as each event occurred. (smart systems)
Unfortunately, there are lots of dumb X ports out there (haven't seen
what XFree86 does).
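
A minimal sketch of the first option, with a cached word standing in for
the shared-memory time: the clock is read once per batch and every event
in the batch is stamped from memory.  BatchClock and stamp_batch are
hypothetical names, not anything from X or Apache.

```python
import time


class BatchClock:
    """Cache the current time; refresh it once per batch of events."""

    def __init__(self, source=time.time):
        self._source = source   # the expensive call we want to avoid
        self._now = source()
        self.reads = 1          # how many times the real clock was read

    def tick(self):
        """Refresh the cached time (call once per batch)."""
        self._now = self._source()
        self.reads += 1

    def now(self):
        """Return the cached time: a plain memory read."""
        return self._now


def stamp_batch(clock, events):
    """Timestamp every event in a batch with a single clock read."""
    clock.tick()
    return [(clock.now(), event) for event in events]
```

Stamping a batch of 75 events this way reads the real clock once rather
than 75 times, at the cost of timestamps that are only as fresh as the
last tick.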

> > load (when cycles are scarcest); this is the ideal situation.  A web
> > server probably can't be as simpleminded, but you get the idea anyway.
> In theory it can -- if you're doing userland threads and they're
> multiplexed with select() then you get much of the benefit of how X works.
> That's why I find the userland and userland/kernel hybrid approaches to
> threading so much more interesting than pure kernel threads.
> (Note:  I know we could write a webserver without threads, much like
> squid, but it couldn't be apache then -- it's too hard to do general
> module support without threads or processes.)
> > The problem this model faces for a Web server is how the server gets
> > informed that its underlying database is different, so that it can't
> > trust its in memory copy.  I leave this as an exercise to the readers :-).
> The web server has one other thing going for it in kernel land -- intense
> usage of cached disk data.  X doesn't have that.  For example, a cached
> 1Mb file requires 256 4K pages.  If you've got an intelligent network card
> you can completely avoid 256 TLB misses on each response doing the work in
> the kernel -- or by providing a sendfile()-style interface... anything to
> avoid the need for v->p mappings.

Yup; my message said that sendfile() would likely be a win for large objects;
here the system call overhead is not significant.  But most web objects
are small; system overhead dominates most of the time.

Even this isn't necessarily true; on some systems, you can have large
TLB entries (megabytes in size).  The question I have no data on is
how many systems actually do anything with vadvise() to get the
system to "do the right thing".  But I believe a compromise, rather than
all one or the other, is a portable, high performance solution.  Exactly
where the size cutoff between the two lies would be worth some
performance analysis once one has running code.
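
The compromise might look roughly like the sketch below.  HybridSender
and the 8192-byte SMALL_LIMIT are hypothetical; the cutoff is a
placeholder to be replaced by measurement.  Note the loud assumption:
there is no cache invalidation here, so a file updated on disk would be
served stale.

```python
import os
import socket
import tempfile

SMALL_LIMIT = 8192  # hypothetical cutoff in bytes; needs measurement


class HybridSender:
    """Serve small files from memory, large ones via sendfile."""

    def __init__(self, small_limit=SMALL_LIMIT):
        self.small_limit = small_limit
        self.cache = {}  # path -> bytes; never invalidated in this sketch

    def send(self, sock, path):
        body = self.cache.get(path)
        if body is not None:
            sock.sendall(body)          # no open() at all on a cache hit
            return "cache"
        if os.path.getsize(path) <= self.small_limit:
            with open(path, "rb") as f:
                body = f.read()
            self.cache[path] = body     # pay the open() once, then reuse
            sock.sendall(body)
            return "cache-fill"
        with open(path, "rb") as f:
            # socket.sendfile() uses os.sendfile() where available and
            # falls back to plain send() otherwise.
            sock.sendfile(f)
        return "sendfile"
```

Repeated requests for a small file cost one send each after the first
fill; only objects over the cutoff pay for the open on every request.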

> I really have to put a caveat on all of this:  I'm just blowing hot air, I
> haven't measured any of this, and I'm not likely to do it soon.

I measured all of this for X, way back when (in other words, I don't believe
I'm blowing ANY smoke).  We had to beat the competition (of the day),
which were kernel based systems.

It is truly a win to have the ease of debugging in user space.

I guarantee that with cleverness and care one will always be able to
beat kernel systems in userland.

I used to characterize X as "X is an exercise in avoiding system calls".
It is a good mantra.
				- Jim
