httpd-dev mailing list archives

From Alexei Kosut <ako...@leland.Stanford.EDU>
Subject Re: Apache 2.0/NSPR
Date Fri, 11 Sep 1998 16:44:38 GMT
On Fri, 11 Sep 1998, Dean Gaudet wrote:

> > 4) A lot of the cache could be made kernel resident. This means that  
> > although cache invalidation can be complicated (objects can have a
> > validate method), simple tests should be handlable simply and predictably-
> > for example file modification date, eq checks for negotiation etc.
> > Otherwise you have to leave the kernel to validate.
> For some sites it'd be sufficient to invalidate the entire cache on a
> regular basis.  That's pretty easy to do.  But yeah invalidation is in
> general a painful problem...

Yep... Hey; I've got a brilliant idea. Let's redefine some terms. Apache
only serves valid, up-to-date data. Therefore, anything Apache isn't
serving must be invalid or old. So we don't need to expire anything,
because Apache is perfect :)

Probably what we need are some generic cache-invalidation methods. I can
think of three:

1. Algorithms based on object metainformation, e.g., the Expires
   header (once the current date passes it, the object is invalid), though
   you could go a bit more complex than that.

2. Invalidation callbacks, a la HTTP's If-Modified-Since; a function to
   call to determine if the item is still valid. Some content
   will require this, although it kills any chance of doing real fast
   kernel-style delivery of the cached item.

3. AFS-like "I'll tell you when it's changed" invalidation. I'm not quite
   sure how we'd make this one work. Probably something to do with
   threads. But it's probably the best way to ensure that the server
   never delivers out-of-date content for an arbitrarily changing object.
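
The three methods above can be combined on a single cache entry. What follows is a minimal sketch in Python, not Apache code; `CacheEntry`, `Cache`, and `notify_changed` are hypothetical names chosen for illustration. It assumes an entry is valid only if it has not been flagged as changed (method 3), has not passed its expiry time (method 1), and its optional validator callback agrees (method 2).

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    body: bytes
    expires_at: Optional[float] = None               # 1. metainformation, Expires-style
    validator: Optional[Callable[[], bool]] = None   # 2. invalidation callback
    stale: bool = False                              # 3. set when the store reports a change

    def is_valid(self, now: Optional[float] = None) -> bool:
        if self.stale:
            return False
        now = time.time() if now is None else now
        if self.expires_at is not None and now >= self.expires_at:
            return False
        # The callback is the expensive path: it runs arbitrary code per lookup.
        if self.validator is not None and not self.validator():
            return False
        return True


class Cache:
    def __init__(self) -> None:
        self._entries: Dict[str, CacheEntry] = {}

    def put(self, key: str, entry: CacheEntry) -> None:
        self._entries[key] = entry

    def get(self, key: str, now: Optional[float] = None) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None or not entry.is_valid(now):
            self._entries.pop(key, None)  # drop invalid entries on lookup
            return None
        return entry.body

    def notify_changed(self, key: str) -> None:
        """AFS-style notification: the backing store says the object changed."""
        entry = self._entries.get(key)
        if entry is not None:
            entry.stale = True
```

Only the expiry check and the stale flag are simple comparisons; the validator callback is the part that would rule out fast kernel-style delivery.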


> On a related note, we need to abstract those bits of the filesystem which
> we need to serve HTTP so that backing store doesn't have to be a
> filesystem.  I'd say this should even go as far as modules being able to
> open() a URI and read from it -- so that it can be as transparent as
> possible.  So rather than use ap_bopenf() directly (the apache-nspr
> equivalent of fopen()), modules open a URI through some other interface,
> and that may go to the filesystem or go to a database/whatever.

Yes! That's exactly what I was thinking last night. This not only gives us
the advantage of allowing other backing stores, it means all of those
objects are cached, and also that you can apply filters to them, just like
any file.
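
To make the idea concrete, here is a rough sketch in Python of a single open-a-URI interface sitting in front of pluggable backing stores. It is not the apache-nspr API; `UriSpace`, `register_store`, `add_filter`, and the `file:` and `db:` prefixes are all hypothetical. It assumes a store is just a function from a name to a readable stream, that filters are bytes-to-bytes functions applied to everything, and that results are cached by URI.

```python
import io
from typing import BinaryIO, Callable, Dict, List

Opener = Callable[[str], BinaryIO]   # backing store: name -> readable stream
Filter = Callable[[bytes], bytes]    # content filter applied after reading


def file_store(path: str) -> BinaryIO:
    """Backing store that reads from the local filesystem."""
    return open(path, "rb")


def memory_store(table: Dict[str, bytes]) -> Opener:
    """Backing store standing in for a database: looks names up in a dict."""
    return lambda key: io.BytesIO(table[key])


class UriSpace:
    def __init__(self) -> None:
        self._stores: Dict[str, Opener] = {}
        self._filters: List[Filter] = []
        self._cache: Dict[str, bytes] = {}

    def register_store(self, scheme: str, opener: Opener) -> None:
        self._stores[scheme] = opener

    def add_filter(self, f: Filter) -> None:
        self._filters.append(f)

    def open(self, uri: str) -> BinaryIO:
        """Open a URI without the caller knowing which store backs it."""
        if uri not in self._cache:
            scheme, _, rest = uri.partition(":")
            with self._stores[scheme](rest) as stream:
                data = stream.read()
            for f in self._filters:
                data = f(data)
            self._cache[uri] = data  # every object goes through the same cache
        return io.BytesIO(self._cache[uri])
```

A module would call `space.open("db:greeting")` or `space.open("file:/tmp/x")` in exactly the same way. This sketch never invalidates its cache, which a real design would have to handle.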

This brings up some interesting possibilities, if every file is treated
like this, even, say, the config files. Because you wouldn't need any
special hooks or *anything* special to enable, say, a macro language, or
even a completely different, GUI-based config format. You just configure
the server (I'll leave out the obvious chicken-vs-egg question for now) to
pass the config file URIs through a module that does whatever you want.
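
As an illustration of the config-file case, the following Python sketch runs the raw config text through a list of filters before parsing it. Everything here is an assumption made for the example: `macro_filter` and `load_config` are hypothetical names, the `$name` macro syntax comes from Python's `string.Template` rather than from any Apache config format, and the parser only understands simple "Directive value" lines.

```python
from string import Template
from typing import Callable, Iterable, List, Mapping, Tuple

TextFilter = Callable[[str], str]


def macro_filter(variables: Mapping[str, str]) -> TextFilter:
    """Return a filter that expands $name macros in the config text."""
    def apply(text: str) -> str:
        return Template(text).substitute(variables)
    return apply


def load_config(raw_text: str, filters: Iterable[TextFilter]) -> List[Tuple[str, str]]:
    """Pass the text through each filter, then parse 'Directive value' lines."""
    text = raw_text
    for f in filters:
        text = f(text)
    directives = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition(" ")
        directives.append((name, value.strip()))
    return directives
```

The parser needs no special hook for macros; a filter for a completely different config format would plug into the same list.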


> A difficulty in this is access control -- there's a difference between an
> external request requesting some files, and an internal module requesting
> them.  Different rights. 

Hmm. Hadn't really thought of that. What we really need to do, as Simon
alluded, is apply proper file system semantics to the URI space. With
classes (groups) of users and varying access privileges.
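
One way to picture file-system-style semantics on the URI space is the Python sketch below. It is only an illustration under stated assumptions: `Principal` and `UriAcl` are hypothetical names, permissions are granted per URI prefix and per group, and the longest matching prefix decides, much as the most specific directory would.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Principal:
    name: str
    groups: FrozenSet[str]


class UriAcl:
    def __init__(self) -> None:
        # URI prefix -> group -> set of permissions
        self._rules: Dict[str, Dict[str, Set[str]]] = {}

    def grant(self, prefix: str, group: str, perms: Iterable[str]) -> None:
        self._rules.setdefault(prefix, {}).setdefault(group, set()).update(perms)

    def allowed(self, who: Principal, uri: str, perm: str) -> bool:
        # The longest matching prefix wins; no matching prefix means no access.
        best = max((p for p in self._rules if uri.startswith(p)),
                   key=len, default=None)
        if best is None:
            return False
        rules = self._rules[best]
        return any(perm in rules.get(g, set()) for g in who.groups)
```

An external request and an internal module then differ only in which groups their principal belongs to, so the same lookup covers both cases.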

Of course, Microsoft's already thought of this. But instead of building a
filesystem-like access control and object retrieval system into IIS, they
just modified the NT filesystem. I suggest we not do that.

> > 6) This implies that the namespace model should be mappable in terms of
> > directories, files, and specials (cgi-scripts, etc). This gives the
> > hierarchical component of the resolution process a higher priority than
> > the other phases. 
> I'd like to see the namespace have "mount points" somewhat like the unix
> filesystem.  This controls the hierarchy as far as what the underlying
> store is... and it's a simple system, easy to optimize for.  i.e. I'd
> really like to avoid "if the URI matches this regex then it's served by
> database foobar".  That's far too general.

I'm sure you would like to see that, Dean. :) I doubt it will happen,
though. Users like to be able to do odd things with the URLs they type in. 
And because URIs and URNs never took off, we're stuck with
resource-specific URLs, like http://foobar/file.html. If you have a
widely-publicized file like that, and suddenly decide you want to
exchange it for a database-backed file, it's hard to convince everyone,
using ESP perhaps, to instead go to http://foobar/database/notafile.html.

I think we're going to have to add generic, regex-like matching mechanisms
at some level of the URI parsing code. It may be at a higher level than it
is right now (i.e., perhaps URI rewriting from external URIs to internal
not-quite-URIs), but it'll still have to be there.
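
A sketch of how the two approaches might coexist, in Python: regex rewriting maps external URIs to internal ones first, and a simple mount table (longest prefix wins) then picks the backing store. `Resolver`, `add_rewrite`, and `mount` are hypothetical names, and the assumption that only the first matching rewrite applies is an arbitrary choice made for the example.

```python
import re
from typing import Dict, List, Pattern, Tuple


class Resolver:
    def __init__(self) -> None:
        self._rewrites: List[Tuple[Pattern[str], str]] = []
        self._mounts: Dict[str, str] = {}  # internal URI prefix -> store name

    def add_rewrite(self, pattern: str, replacement: str) -> None:
        self._rewrites.append((re.compile(pattern), replacement))

    def mount(self, prefix: str, store: str) -> None:
        self._mounts[prefix] = store

    def resolve(self, external_uri: str) -> Tuple[str, str]:
        """Return (store name, internal URI) for an external URI."""
        internal = external_uri
        # Stage 1: general regex matching, confined to the outer layer.
        for pattern, replacement in self._rewrites:
            rewritten, count = pattern.subn(replacement, internal)
            if count:
                internal = rewritten
                break  # assumption: only the first matching rewrite applies
        # Stage 2: mount-point lookup, a plain longest-prefix match.
        best = max((p for p in self._mounts if internal.startswith(p)),
                   key=len, default=None)
        if best is None:
            raise LookupError(f"no mount point covers {internal!r}")
        return self._mounts[best], internal
```

With `/` mounted on the filesystem, `/database/` mounted on a database, and a rewrite from `^/file\.html$` to `/database/notafile.html`, the published URL keeps working while the content moves to the database.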

-- Alexei Kosut
   Stanford University, Class of 2001 * Apache
