httpd-dev mailing list archives

From "William A. Rowe, Jr." <>
Subject Re: recursive robot queries
Date Mon, 01 Jan 2001 07:19:40 GMT
From: "Roy T. Fielding" <>
Sent: Monday, January 01, 2001 12:34 AM

> > These are allowed to happen due to content negotiation - any extra
> > information after a valid link is presumed to simply be PATH_INFO
> > information.  So in the example, the above URL will pull up
> > the page "/index", i.e. index.html, with "/full/foundation/...." as the
> > PATH_INFO.  How did this recursion start?
> Blecko... there needs to be a way for ssi files to declare that they
> are going to use path_info (or declare that they are not) so that the
> server can redirect or block access to bogus URLs.
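
To illustrate the behaviour discussed in the quoted text, here is a minimal sketch of how a request path might be split into a resource plus PATH_INFO, and how a per-resource declaration could let the server reject bogus trailing paths. It is not the server's actual code: the function names, the `existing_files` set, and the `accepts_path_info` flag are all hypothetical, and the extension lookup is only a rough stand-in for content negotiation.

```python
def split_path_info(url_path, existing_files, extensions=(".html",)):
    """Return (resource, path_info) for url_path.

    Walks the path left to right. The first prefix that names a file,
    either exactly or with one of the negotiated extensions appended,
    becomes the resource; everything after it is treated as PATH_INFO.
    """
    segments = [s for s in url_path.split("/") if s]
    prefix = ""
    for i, segment in enumerate(segments):
        prefix = prefix + "/" + segment
        for candidate in [prefix] + [prefix + ext for ext in extensions]:
            if candidate in existing_files:
                rest = segments[i + 1:]
                path_info = "/" + "/".join(rest) if rest else ""
                return candidate, path_info
    return None, ""


def resolve(url_path, existing_files, accepts_path_info):
    """Return an (HTTP status, resource, path_info) triple.

    accepts_path_info is a hypothetical mapping from resource to a bool
    that says whether the resource declares that it uses PATH_INFO.
    """
    resource, path_info = split_path_info(url_path, existing_files)
    if resource is None:
        return 404, None, ""
    if path_info and not accepts_path_info.get(resource, False):
        # The resource never asked for PATH_INFO, so the URL is bogus.
        return 404, resource, path_info
    return 200, resource, path_info


files = {"/index.html"}
# Without a declaration, the recursive URL would be served as /index.html.
print(split_path_info("/index/full/foundation/index.html", files))
# With the declaration saying "no PATH_INFO", the same URL is rejected.
print(resolve("/index/full/foundation/index.html", files, {"/index.html": False}))
```

With a declaration like this in place, a robot following relative links from the bogus URL would get an error on the first hop rather than recursing.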

I've thought a lot this last week about our SSI support (trying to parse
FAQ.html, for one, and dealing with other aspects under 2.0).

We've got a lot of conditions now that SSI was just never meant to
cope with.  Take an include of a footer.html from index.html.ja.jis,
where the charset changes from the main body to the included body.
Some include targets nearly require their own 'mini-headers': that is,
header processing from SSI that would allow a subrequest working with
mod_charset_lite, for example, to force the subrequest encoding
back to the parent encoding.

Not to mention that we could kill the ETag/Last-Modified issues for good,
or employ cache control headers.

The reason I've hedged on a rewrite of mod_autoindex is that it's
really a specialized case of SSI.  [That ought to make heads spin.]
Right now it's outside in; I want to look at it inside out.  That would
really make life simple for customizing and fancy indexing, and we
could get rid of the "Header and readme names aren't picking up the
right files!" bugs.

I'm sure we can come up with a ton of these.  The real questions are:
is SSI dead (as a 'growing' entity), in the sense that real growth
of that spec is no longer worthwhile?  If not, how do we begin to
make it relevant in a mixed-content, HTTP/1.1 world?

If it's going to be relevant, I believe we need to begin dealing
with the 'contents as a whole', merge included headers, perhaps
even deal with 'our' meta tags (heck, we are parsing the document,
so why not deal with a content-language meta tag?).  I'm guessing we
do some of this now, and ignore most of it.

Welcome to this next thousand years, everyone :-)

