httpd-dev mailing list archives

From Brian Behlendorf
Subject Re: Content-type negotiation: thoughts and code
Date Mon, 13 Mar 1995 23:55:29 GMT

Okay, review time:

On Sun, 12 Mar 1995, Robert S. Thau wrote:
> First off, with regard to the question of whether to support
> CERN-style auto-arbitration based on filename extensions, or to have
> explicit map files, I'd like to suggest that we can afford to support
> both.  Directories take very little extra code to support beyond what
> you need to handle the contents of the map files themselves --- from
> the server's point of view, they're just another form of map file,
> which happens to come pre-parsed.  However, the gain in convenience
> for people who can use the feature is substantial.
> (Of course, some people *don't* want to use the feature, so it needs
> to be a configuration option.  The way I've done it in my code above,
> you need "Options MultiViews" enabled in a directory in order for "GET
> /.../foo" to be resolved to "/.../foo.gif" or "/.../foo.jpeg".  In
> directories where MultiViews is off, the server behaves exactly as it
> would if the directory-scanning code were never there, so I don't
> *think* there's a back-compatibility issue ;-).

This is all solid, I agree.  In testing, though, it doesn't look like the
automatic selection is working exactly right.  The .map-based system
apparently works fine (try <URL:>), but the q variable doesn't seem to be
used properly (or maybe I don't understand it) in the auto-negotiation:

There's an "index2.html" and "index2.html3".  I added an AddType to bind
.html3 to text/x-html3. Working directly with the server ("telnet 8000"), I try

1) HEAD /index2 HTTP/1.0
   Accept: text/x-html3		returns text/x-html3, no problem

2) HEAD /index2 HTTP/1.0
   Accept: text/html		returns text/html, no problem

3) HEAD /index2 HTTP/1.0
   Accept: text/x-html3
   Accept: text/html		returns text/x-html3, no problem

4) HEAD /index2 HTTP/1.0
   Accept: text/html; q=0.1	
   Accept: text/x-html3; q=1.0	returns text/x-html3, no problem

5) HEAD /index2 HTTP/1.0
   Accept: text/html; q=0.600
   Accept: text/x-html3; q=0.800   

     returns text/html, which doesn't make sense, unless there's some 
     internal preference for html over x-html3  (and I'm not using a map 
     file).  These are the q values sent by Arena. Order doesn't seem to

Any clue?
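
For reference, here is a minimal sketch of the selection one might expect in
case 5, assuming the server simply picks the available variant whose type
carries the highest q value. The function names (parse_accept,
choose_variant) are hypothetical and are not taken from Robert's code;
wildcards such as */* are ignored, and ties go to the first variant listed,
which is an arbitrary choice made for this sketch.

```python
def parse_accept(header_lines):
    """Parse Accept header values into {media_type: q}; q defaults to 1.0."""
    prefs = {}
    for line in header_lines:
        for item in line.split(","):
            parts = [p.strip() for p in item.split(";")]
            if not parts[0]:
                continue
            q = 1.0
            for param in parts[1:]:
                name, _, value = param.partition("=")
                if name.strip() == "q":
                    q = float(value)
            prefs[parts[0].lower()] = q
    return prefs


def choose_variant(variants, header_lines):
    """Return the (filename, type) pair with the highest q, or None."""
    prefs = parse_accept(header_lines)
    best, best_q = None, 0.0
    for filename, media_type in variants:
        # Types the client did not list are treated as unacceptable (q = 0).
        q = prefs.get(media_type, 0.0)
        if q > best_q:
            best, best_q = (filename, media_type), q
    return best


# The two variants from the test above, with the q values Arena sends.
variants = [("index2.html", "text/html"), ("index2.html3", "text/x-html3")]
arena = ["text/html; q=0.600", "text/x-html3; q=0.800"]
print(choose_variant(variants, arena))
```

Under these assumptions, case 5 would select index2.html3, since 0.800 is
greater than 0.600.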

Also, I can set DirectoryIndex to a .map file.  Yayay!  Can I set it to 
"index2"... nope.  How much work is involved in being able to set the 
DirectoryIndex to use autonegotiation?  Finally, I noticed that I can't 
yet have server

> A second interesting thing which comes up is what to do with clients
> like certain, ahem, colorful browser betas which ship completely bogus
> Accept: headers (or HTTP/0.9 browsers, which don't ship any).  My code
> basically pretends the browser did "Accept: text/html" and "Accept:
> text/plain" whether it actually did or not, to ease this difficulty;
> is that the right thing?

Sounds fine to me.  
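
A rough sketch of that fallback, for the record. It assumes the client's
preferences have already been parsed into a dict mapping type to q, and that
a q value the client did send for text/html or text/plain is left alone;
both points, and the name effective_accept, are assumptions made for this
illustration rather than a description of Robert's code.

```python
DEFAULT_TYPES = ("text/html", "text/plain")


def effective_accept(prefs):
    """Return a copy of prefs in which text/html and text/plain are always
    acceptable, whether or not the client listed them."""
    result = dict(prefs)
    for media_type in DEFAULT_TYPES:
        # Only fill in a type the client omitted (assumed q of 1.0).
        result.setdefault(media_type, 1.0)
    return result


# An HTTP/0.9 client sends no Accept headers at all.
print(effective_accept({}))
```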

> Then there's security.  The issue here is that if some Malevolent
> Entity (say, a cracker exploiting a leaky ftp server) can create
> type-map files, you don't want the server believing one which names
> /etc/passwd as the text/plain view of the composite entity described
> by /inoccuous/directory/  My code takes the
> thoroughly draconian approach of making all pathnames in map files
> relative to the map file itself, and *disallowing* relative paths
> containing '/', so a type-map file can *only* name things in the same
> directory (although those can be symlinks *if* FollowSymLinks is
> enabled).  Is that the wrong thing?  If so, what's the right thing?

Hmm - another directive a la FollowSymLinks seems in order.  At least 
allowing a map file to go down directories would be good.  Actually 
"Includes" seems like it covers the same ground security-wise...

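
To make the draconian rule concrete, here is a small sketch of the check as
described in the quoted text: names in a map file are taken relative to the
map file, and anything containing '/' is refused. The function names are
hypothetical, rejecting "." and ".." is an extra assumption added here, and
the symlink / FollowSymLinks part is not modelled at all.

```python
import posixpath


def is_safe_map_entry(name):
    """True if name can only refer to something in the map file's directory."""
    return bool(name) and "/" not in name and name not in (".", "..")


def resolve_map_entry(map_path, name):
    """Resolve name relative to the map file, refusing names with '/'."""
    if not is_safe_map_entry(name):
        raise ValueError("map entry may not leave its directory: %r" % name)
    return posixpath.join(posixpath.dirname(map_path), name)


print(resolve_map_entry("/docs/foo.map", "foo.gif"))
```

Relaxing this to allow going down (but not up) the tree would mean
permitting '/' while still refusing a leading '/' and any ".." component.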
> As a final point, writing the code raises the question of what the
> map files should look like.  What I've done probably isn't the right
> thing, but a discussion of what's wrong with it might prove
> instructive.  The map files implemented by my code just look like:
> audio/basic
>   foo.gif: image/gif
>   foo.html: text/html
>   foo.txt: text/plain

Looks great to me.  
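
A minimal sketch of reading that format, assuming each meaningful line is
"filename: type" and that lines without a colon are simply skipped. The
function name parse_type_map is hypothetical, and the skipping behaviour is
an assumption of this sketch, not something the quoted description states.

```python
def parse_type_map(text):
    """Parse 'filename: type' lines into a list of (filename, type) pairs."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue  # assumption: ignore blank lines and lines with no colon
        name, _, media_type = line.partition(":")
        entries.append((name.strip(), media_type.strip()))
    return entries


sample = """\
  foo.gif: image/gif
  foo.html: text/html
  foo.txt: text/plain
"""
for name, media_type in parse_type_map(sample):
    print(name, media_type)
```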

> My question for the group is, what else do we want?  Extra MIME header
> lines?  Some way of discriminating on USER_AGENT?  

This gets back to the whole meta-information debate - where can we allow 
people to add the "Refresh: 10" lines in the absence of a file system that
makes this easy.  I think that the purpose for this and content 
negotiation is roughly separable - thus, we could have something like a 
".meta" file in each directory which acted as a way to store 
metainformation about different files in that directory.  It would have 
the performance penalty of a stat() and a read if it exists, which means 
people should use it sparingly and in directories with not too many other 
files, but it could be useful.  Whether we should allow conditionals in 
the format ("if(USER_AGENT) =~ /*Mozilla*/", etc) is a big question - for 
ease of implementation I'd argue against it for now at least.

Also, documentation: Randy, since you're in charge of the HTML end of the 
apache web pages, would you be willing to handle this?  Each patch should 
at least have a mention.  


