Received: by taz.hyperreal.com (8.6.12/8.6.5) id KAA24555;
    Sun, 5 Nov 1995 10:19:12 -0800
Received: from arachnet.algroup.co.uk by taz.hyperreal.com (8.6.12/8.6.5)
    with SMTP id KAA24532; Sun, 5 Nov 1995 10:19:05 -0800
Received: from heap.ben.algroup.co.uk by arachnet.algroup.co.uk id aa02017;
    5 Nov 95 18:18 GMT
Received: from gonzo.ben.algroup.co.uk by heap.ben.algroup.co.uk id aa00673;
    5 Nov 95 18:15 GMT
Subject: Re: double slashes (was Re: WWW Form Bug Report: "Security bug
    involving ScriptAliased directories" on Linux)
To: new-httpd@hyperreal.com
Date: Sun, 5 Nov 1995 18:02:50 +0000 (GMT)
From: Ben Laurie
In-Reply-To: from "David Robinson" at Nov 5, 95 03:48:00 pm
X-Mailer: ELM [version 2.4 PL24 PGP2]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 5495
Message-ID: <9511051802.aa16822@gonzo.ben.algroup.co.uk>
Sender: owner-new-httpd@apache.org
Precedence: bulk
Reply-To: new-httpd@apache.org

>
> >1. There isn't a bug - appropriate configuration would solve the reported
> >problem. Putting cgi-bin in your document tree seems unwise. I suppose that
> >other URLs would also be able to access it
> >(e.g. /somedir/../cgi-bin/somescript), but I haven't tried it. Also,
> >presumably, SSIs could include them, even with the new restrictions, which
> >would present an internal security problem.
>
> >2. Double slashes don't currently mean anything to Apache, don't make any
> >great sense to me, and lead to unintuitive defeats of various useful
> >mechanisms. It would seem not unreasonable to ban them, pending a defined use
> >of them, or to convert them all to single slashes.
>
> >I don't understand the need for support of // in PATH_INFO, though.
>
> Err, in which case, why support the ^ character either? (Example chosen at
> random.)

I meant a real rather than philosophical need.
// is essentially meaningless when translated to any reasonable file system
(definition of reasonable left as an exercise for the reader). I was under
the impression that someone was actually using // to mean something, and I
wondered what.

>
> Let me tell a story.
>
> An http URL was defined as http://host[:port][path]
> A path is defined as concatenation of zero or more path segments, separated
> by '/'. A path segment may be _zero_ or more of a set of characters.
>
>   ""   /wibble   /wombat/subres   /wombat//subres   /wibble////
>
> are all valid paths. /wombat/subres and /wombat//subres are not required
> to identify the same resource.
>
> Problem: the Unix file system does not allow "" as the name of a file.
> Conventional behaviour _ignores_ null path segments in pathnames passed to
> the system routines; the pathnames /foo/bar and /foo//bar represent the same
> file.
>
> So, how do we map the URL semantics to the file system's semantics?
> The NCSA (& Apache 0.6.5) solution; remove null path segments from the entire
> URL. Thus http://host/wombat/subres and http://host/wombat//subres access the
> same resource.
>
> What is wrong with this?
> 1. Relative links don't work.
>    If the document subres contains a link to ../index.html
>    Then when accessed as http://host/wombat/subres, the link refers to
>    http://host/index.html; Whereas for http://host/wombat//subres, the
>    link refers to http://host/wombat/index.html.
>
> 2. CGI scripts don't get the data they expect.
>    If /cgi-bin/fetch is a script then an access to
>    http://host/cgi-bin/fetch/some//path
>    calls the script with PATH_INFO set to /some/path
>
> 3. I don't think that documents should have multiple URLs unless the user
>    wanted this.

Erm, but in point 1 your document not only has multiple URLs but it also
behaves differently according to which one is used.
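To make the point 1 behaviour concrete, here is a minimal sketch of how a
client might resolve a relative link against each of the two URLs. It is an
illustration only: the function name resolve is made up for this example, it
handles paths only (no scheme, host or query), and it treats an empty path
segment as a real segment, as the quoted URL definition does. It is not taken
from any particular browser or server.

```python
def resolve(base_path, rel):
    """Resolve a relative reference against a base URL path.

    Hypothetical helper for illustration. Empty path segments are kept as
    real segments, so "/wombat//subres" has the segments "wombat", "" and
    "subres".
    """
    # Keep the base up to and including its last "/", then append the link.
    merged = base_path[: base_path.rfind("/") + 1] + rel
    out = []
    for seg in merged.split("/")[1:]:
        if seg == "..":
            # ".." removes the previous segment, even if that segment is "".
            if out:
                out.pop()
        elif seg != ".":
            out.append(seg)
    return "/" + "/".join(out)


print(resolve("/wombat/subres", "../index.html"))   # /index.html
print(resolve("/wombat//subres", "../index.html"))  # /wombat/index.html
```

In the second case the ".." only cancels the empty segment, which is why the
same document ends up pointing at a different index.html.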
No doubt one could construct a rather intriguing website like this (for
instance /// would pass // on to all relative links, which could then also
have different behaviour...), but is it helpful? Or is that what you are
saying?

>
> Other solutions would be to _redirect_ the request (redirect
> http://host/wombat//subres -> http://host/wombat/subres) so that relative
> links 'work', or treat the request as asking for the access to a directory
> ("") which does not exist, and return 404 Not Found, or 403 Forbidden.
> This might optionally be not applied to the PATH_INFO that will be handled
> by a CGI script.
>
> The Apache behaviour:
>
> Apache attempts to emulate the NCSA behaviour, but without removing multiple
> slashes from PATH_INFO data. Unfortunately, it gets it _wrong_; although in
> the majority of cases it ignores void path segments, it does not always do
> so. Here are the bugs:
> * Multiple slashes defeat Alias, ScriptAlias and Redirect directives.
>
>   DocumentRoot /web/docs
>   ScriptAlias /cgi-bin /web/cgi-bin
>
>   http://host//cgi-bin/c references the file /web/docs/cgi-bin/c
>   http://host/cgi-bin/c references the script /web/cgi-bin/c
>
>   This is not compatible with the NCSA behaviour, which would map both
>   of these to the CGI script
>
> * Multiple slashes defeat the Userdir directive.
>
>   http://host//~drtr/dir/ http://host/~drtr//dir/ http://host/~drtr/dir/
>
>   all reference the _same_ file under NCSA httpd; Apache treats the first
>   as a reference to /web/docs/~drtr/dir/ rather than /home/drtr/dir/
>
>   Similarly, AddDescription does not work.
>
> Whether these bugs are significant is a moot point. However, they are bugs,
> and they do represent incompatibilities with NCSA, and they could catch out
> the unwary. (As has happened; the original poster had assumed that
> ScriptAlias /cgi-bin ... would mean that /documentroot/cgi-bin would not
> be accessible to the client directly.)
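The ScriptAlias case quoted above can be modelled in a few lines. This is
only a sketch of the general failure mode (a literal prefix match done before
any slash handling), not Apache's or NCSA's actual code. The names translate,
DOCUMENT_ROOT and SCRIPT_ALIASES are invented for the example, and the paths
are the ones from the quoted configuration. Note that the collapse option
squeezes slashes in the whole path, so it does not preserve // in PATH_INFO.

```python
import posixpath
import re

# Values taken from the quoted example configuration.
DOCUMENT_ROOT = "/web/docs"
SCRIPT_ALIASES = {"/cgi-bin": "/web/cgi-bin"}


def translate(url_path, collapse):
    """Map a URL path to ("script" or "file", filesystem path).

    Hypothetical model: with collapse=False the alias prefix is compared
    against the raw path; with collapse=True runs of "/" are squeezed to a
    single "/" first.
    """
    if collapse:
        url_path = re.sub(r"/{2,}", "/", url_path)
    for prefix, target in SCRIPT_ALIASES.items():
        if url_path == prefix or url_path.startswith(prefix + "/"):
            return "script", target + url_path[len(prefix):]
    # No alias matched: serve from the document root. normpath mimics the
    # file system ignoring empty segments in the middle of a pathname.
    return "file", posixpath.normpath(DOCUMENT_ROOT + url_path)


print(translate("/cgi-bin/c", collapse=False))   # ('script', '/web/cgi-bin/c')
print(translate("//cgi-bin/c", collapse=False))  # ('file', '/web/docs/cgi-bin/c')
print(translate("//cgi-bin/c", collapse=True))   # ('script', '/web/cgi-bin/c')
```

With collapse=False the double slash slips past the alias and reaches
/web/docs/cgi-bin/c as a plain file, which is the surprise the original
poster ran into.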
>
> Apache should, at the very least, be consistent in its handling of void
> path segments. I think the current NCSA behaviour is poor, and that
> such requests should be either redirected or forbidden. The clients that
> rst sees making these accesses are probably getting pagefuls of bad links.

Ah. So do I understand that you are recommending that either /a//b is
redirected (automatically) to /a/b, or it is forbidden? And what about
PATH_INFO when you've done this? Or are you saying that only the filesystem
component should be redirected?

Confused,

Ben.

--
Ben Laurie                  Phone: +44 (181) 994 6435
Freelance Consultant        Fax:   +44 (181) 994 6472
and Technical Director      Email: ben@algroup.co.uk
A.L. Digital Ltd,
London, England.