Return-Path: owner-new-httpd
Received: by taz.hyperreal.com (8.6.10/8.6.5) id PAA19690; Mon, 13 Mar 1995 15:56:17 -0800
Received: from get.wired.com by taz.hyperreal.com (8.6.10/8.6.5) with ESMTP id PAA19685; Mon, 13 Mar 1995 15:56:15 -0800
Received: by get.wired.com (8.6.10/8.6.5) id PAA04654; Mon, 13 Mar 1995 15:55:30 -0800
Date: Mon, 13 Mar 1995 15:55:29 -0800 (PST)
From: Brian Behlendorf
To: new-httpd@hyperreal.com
Subject: Re: Content-type negotiation: thoughts and code
In-Reply-To: <9503130435.AA02106@volterra>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-new-httpd@hyperreal.com
Precedence: bulk
Reply-To: new-httpd@hyperreal.com

Okay, review time:

On Sun, 12 Mar 1995, Robert S. Thau wrote:

> First off, with regard to the question of whether to support
> CERN-style auto-arbitration based on filename extensions, or to have
> explicit map files, I'd like to suggest that we can afford to support
> both.  Directories take very little extra code to support beyond what
> you need to handle the contents of the map files themselves --- from
> the server's point of view, they're just another form of map file,
> which happens to come pre-parsed.  However, the gain in convenience
> for people who can use the feature is substantial.
>
> (Of course, some people *don't* want to use the feature, so it needs
> to be a configuration option.  The way I've done it in my code above,
> you need "Options MultiViews" enabled in a directory in order for "GET
> /.../foo" to be resolved to "/.../foo.gif" or "/.../foo.jpeg".  In
> directories where MultiViews is off, the server behaves exactly as it
> would if the directory-scanning code were never there, so I don't
> *think* there's a back-compatibility issue ;-).

This is all solid, I agree.

In testing it doesn't look like the automatic selection is working exactly
right.
The .map-based system apparently works fine (try ), but apparently the q
variable isn't being used properly (or maybe I don't understand it) in the
auto-negotiation:

There's an "index2.html" and "index2.html3".  I added an AddType to bind
.html3 to text/x-html3.  Working directly with the server ("telnet
hyperreal.com 8000"), I try

1) HEAD /index2 HTTP/1.0
   Accept: text/x-html3

   returns text/x-html3, no problem

2) HEAD /index2 HTTP/1.0
   Accept: text/html

   returns text/html, no problem

3) HEAD /index2 HTTP/1.0
   Accept: text/x-html3
   Accept: text/html

   returns text/x-html3, no problem

4) HEAD /index2 HTTP/1.0
   Accept: text/html; q=0.1
   Accept: text/x-html3; q=1.0

   returns text/x-html3, no problem

5) HEAD /index2 HTTP/1.0
   Accept: text/html; q=0.600
   Accept: text/x-html3; q=0.800

   returns text/html, which doesn't make sense, unless there's some
   internal preference for html over x-html3 (and I'm not using a map
   file).  These are the q values sent by Arena.  Order doesn't seem to
   matter.  Any clue?

Also, I can set DirectoryIndex to a .map file.  Yayay!  Can I set it to
"index2"... nope.  How much work is involved in being able to set the
DirectoryIndex to use autonegotiation?

Finally, I noticed that I can't yet have server

> A second interesting thing which comes up is what to do with clients
> like certain, ahem, colorful browser betas which ship completely bogus
> Accept: headers (or HTTP/0.9 browsers, which don't ship any).  My code
> basically pretends the browser did "Accept: text/html" and "Accept:
> text/plain" whether it actually did or not, to ease this difficulty;
> is that the right thing?

Sounds fine to me.

> Then there's security.  The issue here is that if some Malevolent
> Entity (say, a cracker exploiting a leaky ftp server) can create
> type-map files, you don't want the server believing one which names
> /etc/passwd as the text/plain view of the composite entity described
> by /inoccuous/directory/pretty-bunny.map.
> My code takes the
> thoroughly draconian approach of making all pathnames in map files
> relative to the map file itself, and *disallowing* relative paths
> containing '/', so a type-map file can *only* name things in the same
> directory (although those can be symlinks *if* FollowSymLinks is
> enabled).  Is that the wrong thing?  If so, what's the right thing?

Hmm - another directive a la FollowSymLinks seems in order.  At least
allowing a map file to go down directories would be good.  Actually
"Includes" seems like it covers the same ground security-wise...

> As a final point, writing the code raises the question of what the
> map files should look like.  What I've done probably isn't the right
> thing, but a discussion of what's wrong with it might prove
> instructive.  The map files implemented by my code just look like:
>
>   foo.au: audio/basic
>   foo.gif: image/gif
>   foo.html: text/html
>   foo.txt: text/plain

Looks great to me.

> My question for the group is, what else do we want?  Extra MIME header
> lines?  Some way of discriminating on USER_AGENT?

This gets back to the whole meta-information debate - where can we allow
people to add the "Refresh: 10" lines in the absence of a file system that
makes this easy?  I think that the purposes of this and of content
negotiation are roughly separable - thus, we could have something like a
".meta" file in each directory which acted as a way to store
metainformation about different files in that directory.  It would have
the performance penalty of a stat(), plus a read if the file exists, which
means people should use it sparingly and in directories with not too many
other files, but it could be useful.  Whether we should allow conditionals
in the format ("if(USER_AGENT) =~ /*Mozilla*/", etc) is a big question -
for ease of implementation I'd argue against it for now at least.

Also, documentation: Randy, since you're in charge of the HTML end of the
Apache web pages, would you be willing to handle this?
Each patch should at least have a mention.

	Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@hotwired.com  brian@hyperreal.com  http://www.hotwired.com/Staff/brian/
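
For reference, a rough sketch of the selection one would expect in test 5
above, assuming the server simply returns the available type with the
highest q and treats a missing q as 1.0.  This is Python for illustration
only, not the server's actual code; parse_accept and pick_variant are
hypothetical names, and wildcards such as "*/*" are deliberately ignored.

```python
def parse_accept(header_lines):
    """Turn Accept header values into {media_type: q}; q defaults to 1.0."""
    prefs = {}
    for line in header_lines:
        for item in line.split(","):
            parts = [p.strip() for p in item.split(";")]
            media_type = parts[0].lower()
            if not media_type:
                continue
            q = 1.0
            for param in parts[1:]:
                name, _, value = param.partition("=")
                if name.strip().lower() == "q":
                    q = float(value)
            prefs[media_type] = q
    return prefs


def pick_variant(available, header_lines):
    """Return the available media type with the highest q, or None.

    Assumption: no server-side preference between variants.  On a tie,
    the type listed first in `available` wins.
    """
    prefs = parse_accept(header_lines)
    candidates = [(prefs[t], t) for t in available if prefs.get(t, 0.0) > 0.0]
    if not candidates:
        return None
    # max() returns the first maximal element, so ties keep `available` order.
    return max(candidates, key=lambda c: c[0])[1]


# Test 5: the two variants behind /index2 and the q values sent by Arena.
choice = pick_variant(
    ["text/html", "text/x-html3"],
    ["text/html; q=0.600", "text/x-html3; q=0.800"],
)
print(choice)
```

Under those assumptions the higher q wins, so this picks text/x-html3 for
test 5.  A result of text/html there would suggest the q values are being
parsed or compared differently, or that some other preference is applied.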