httpd-dev mailing list archives

From Dirk-Willem van Gulik
Subject Re: off subject .. in URLs
Date Wed, 24 Mar 1999 09:43:36 GMT

Ian Kallen wrote:
> That's not the point.  Since Roy seemed to suggest that paths that have
> an extra level of indirection should just be externally redirected to the
> "real" path, I'm wondering if that means that any mangled path (extra
> levels of indirection, extra path delimiters, etc) should be handled in
> that way.  For my purposes wrt aggregating traffic, I've been fixing all
> of that crap up when I do log analyses, but hey if Apache should be
> 302'ing (actually, 301, right?) the request that's news to me.

Hmm.. I am a bit worried about where this is going. Let me try to write
down what I think is the reasoning behind what apache should do, and then
see how that compares with what it does do.

Normally the URI passed to apache is a URL, and thus if one
encodes href="../zappa.zop" in the HTML document zoink/fred/froo.html,
a click should cause a fetch for zoink/zappa.zop rather than for
zoink/fred/../zappa.zop, because that is how relative paths are specified:
the client resolves them before it sends the request.
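
To make the client-side resolution concrete, here is a minimal sketch. Python's
urllib.parse is my choice for illustration, and the host froo.frob is borrowed
from the example further down; neither is implied by how any particular browser
does it.

```python
from urllib.parse import urljoin

# Base document and the relative reference found in its href attribute.
# The scheme and host are assumptions added so the URL is absolute.
base = "http://froo.frob/zoink/fred/froo.html"
resolved = urljoin(base, "../zappa.zop")

# The ".." is resolved by the client, so the server never sees it.
print(resolved)
```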

However if the document (or form, java-applet or frob-application) has a
URI which is http://froo.frob/zappa/.././....././oink.html you have
to be really careful as to what extent you allow apache to fiddle with it.

With the proxy, socked-backends, etc, etc it might be quite valid (or even
be something like urn:morsespace:froo.frob/.-./..-/..../.././-.). In other
words whatever travels on the wire between the GET and the HTTP/1.0 on
the first line of the request is kind of sacred.
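
A rough sketch of what "sacred" could mean in practice: pull the request target
out of the request line without decoding, collapsing or otherwise touching it.
The function name split_request_line is hypothetical, and the sketch assumes a
full "METHOD target VERSION" line; it is not how apache itself parses requests.

```python
def split_request_line(line: str):
    """Split 'METHOD target VERSION' and return the target verbatim."""
    line = line.rstrip("\r\n")
    # The method ends at the first space, the version starts at the last one;
    # everything in between is handed back exactly as it arrived.
    method, _, rest = line.partition(" ")
    target, _, version = rest.rpartition(" ")
    return method, target, version


method, target, version = split_request_line(
    "GET urn:morsespace:froo.frob/.-./..-/..../.././-. HTTP/1.0\r\n"
)
print(target)
```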

Once you are _sure_ that the URI (or the Path part of it) is really for you
as the server, and really is going to point to a 'physical' file on a (unix like)
file system, then the story becomes a different one, as you have to deal with
the semantics of the '/' and the '.' in a unix file specification context
as opposed to their role in a URI.

Their meaning (though semantically quite equivalent) is fundamentally not the
same as their hierarchical meaning in an HTTP (or ftp) URL as defined in RFC 1808
or thereabouts. Then, and only then, should you be allowed to touch such 'special
characters'. The same applies to things like 'C|' and 'C:' for windows. But you
are then firmly in server application land. At this stage you are fine to reply
with a 30x, which IMHO means something like 'well this x uri really should be written
as y to make sense to me in this context'.
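
As a sketch of that last step only: once the path is known to map onto a unix
like file system, the server could compute the form it prefers and answer with
a 30x when the two differ. The helper canonical_redirect is hypothetical, the
choice of 301 is an assumption, and posixpath.normpath stands in for whatever
cleanup the server really performs.

```python
import posixpath


def canonical_redirect(path: str):
    """Return (status, location) if the path should be rewritten, else None."""
    # normpath applies POSIX file name rules: it drops "." segments, resolves
    # "..", and collapses repeated slashes (exactly two leading slashes are
    # left alone, as POSIX allows).
    canonical = posixpath.normpath(path)
    # normpath strips a trailing slash; put it back so directories keep theirs.
    if path.endswith("/") and canonical != "/":
        canonical += "/"
    if canonical == path:
        return None
    return 301, canonical


print(canonical_redirect("/zappa/.././....././oink.html"))
print(canonical_redirect("/zoink/zappa.zop"))
```

Note that "....." is an ordinary file name under these rules, so it survives
the cleanup; only "." and ".." are special.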

And I think that this is what apache does right now. I hope.

Do I still make sense?

