Subject: Re: Patch go boom...
To: new-httpd@hyperreal.com
Date: Tue, 10 Oct 1995 12:51:41 +0100 (BST)
From: Ben Laurie <ben@gonzo.ben.algroup.co.uk>
In-Reply-To: <9510101102.AA11492@tees.elsevier.co.uk> from "Andrew Wilson" at
 Oct 10, 95 12:02:22 pm
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 3147      
Message-ID: <9510101251.aa18336@gonzo.ben.algroup.co.uk>
Sender: owner-new-httpd@apache.org
Precedence: bulk
Reply-To: new-httpd@apache.org

> 
> > > Mmm, 
> > > 
> > > 	04a_ExtraPath.0.8.14.patch
> > > 
> > > For 04a_ExtraPath.0.8.14.patch Ben L writes:
> > > 
> > > 	Changelog: Prevent Apache from serving /x/y/z/a as /x/y when z/a
> > > 		doesn't exist.
> > > 
> > > I sort of thought we'd already hammered this out a couple of rounds ago.
> > > I recall Rob H and I fussing because we both knew of code that used to
> > > expect a /x/y CGI-BIN script to be passed /z/a as PATH_INFO.  Rob H, you
> > > wanna confirm?
> > > 
> > > Ben, do you want to compare this with:
> > > 
> > > 	http://sumwarez.com/cgi-bin/test-cgi/foo/bar
> > > 
> > > where PATH_INFO == /foo/bar
> > > 
> > > Or am I missing summink.
> 
> 
> Ben:
> 
> > Yep. This patch does not affect cgi scripts PATH_INFO stuff. The patch
> > prevents ordinary pages from exhibiting this bizarre behaviour.
> 
> But, but...  it's not bizarre.  For a URL like:
> 
> 	http://where/foo.html/bar/baz
> 
> there's a web resource called foo.html which can receive /bar/baz as PATH_INFO.
> If foo.html is a SSI-enabled page (chmod u+x, or renamed to *.shtml) then
> PATH_INFO is passed to the SSI environment and everyone's happy.  In this
> sense foo.html is working as a script.
> 
> But if foo.html is just a regular page (no SSI) then why should the server
> behave differently?  Specifically, why should the browser be made to care
> whether or not the resource can make use of the additional path information?
> 
> A counter argument would be:
> 
> 	"Sure, then what's to stop people from sending URLs like:
> 	http://where/aaaa/any/old/stuff/and/nonsense"
> 
> and my response would be:
> 
> 	"Provided there's an 'aaaa' or 'aaaa/any' or 'aaaa/any/old' etc,
> 	etc, then it doesn't matter.  Search the URL from left to right
> 	stopping at the last matching resource (.html, .shtml, .cgi) and
> 	everything remaining to the right is for the resource to deal with."
> 
> RobH:
> > I think I remember someone saying that PATH_INFO can be used by SSI,
> > so is the patch still necessary? 
> 
> Well, I was confused.  This patch has no effect if foo.html is SSI
> enabled.  But that's not the point.
> 
> I don't like this patch ;)  But I wonder if we all agree about what URLs
> really mean.  For my argument a URL != UNIX file, and I believe we'd be
> limiting the flexibility of the server by adding this new behaviour.
> 
> Any offers?  Perhaps Roy F's got a clue here?
> 

I agree in essence: a URL is not a file. However, it seems to me that the
whole URL should be used to determine the content. Redundant extra bits
could simply be ignored, but it seems more sensible and useful not to
ignore them. Of course, for SSI and CGI, enforcing this is beyond the remit
of the server. Where plain ordinary files are concerned, though, the server
can see that the extra path is meaningless and should complain
appropriately. If some think the old behaviour is useful, we can make it a
configurable flag.
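
To make the two behaviours concrete, here is a rough sketch of the
left-to-right search and of the check being proposed. It is not the patch
and not Apache code: the names resolve, serve and allow_extra_path are
made up for illustration, the resource table stands in for the filesystem,
and answering 404 is only one assumed reading of "complain appropriately".

```python
# Sketch only: resolve(), serve() and allow_extra_path are hypothetical
# names, not Apache's code or configuration.

def resolve(url_path, resources):
    """Walk the URL path left to right; return (resource, path_info).

    `resources` maps a path such as "/foo.html" to its kind:
    "plain", "ssi" or "cgi". Directories are not modelled, so the
    walk stops at the first prefix that names a resource.
    Returns (None, "") when no prefix names a resource.
    """
    segments = [s for s in url_path.split("/") if s]
    prefix = ""
    for i, segment in enumerate(segments):
        prefix += "/" + segment
        if prefix in resources:
            rest = segments[i + 1:]
            # Everything to the right of the resource is its PATH_INFO.
            path_info = "/" + "/".join(rest) if rest else ""
            return prefix, path_info
    return None, ""


def serve(url_path, resources, allow_extra_path=False):
    """Return an HTTP status code for the request (assumed: 200 or 404)."""
    resource, path_info = resolve(url_path, resources)
    if resource is None:
        return 404
    kind = resources[resource]
    # SSI pages and CGI scripts decide for themselves what PATH_INFO means.
    # A plain file cannot use it, so reject unless the old behaviour is on.
    if path_info and kind == "plain" and not allow_extra_path:
        return 404
    return 200


resources = {
    "/foo.html": "plain",
    "/page.shtml": "ssi",
    "/cgi-bin/test-cgi": "cgi",
}
```

With this table, serve("/foo.html/bar/baz", resources) is refused, while
the same call with allow_extra_path=True serves foo.html as before. The
SSI and CGI entries get their PATH_INFO either way.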

> Ay.
> 

-- 
Ben Laurie                  Phone: +44 (181) 994 6435
Freelance Consultant        Fax:   +44 (181) 994 6472
and Technical Director      Email: ben@algroup.co.uk
A.L. Digital Ltd,
London, England.