httpd-dev mailing list archives

From Rob Hartill <r...@imdb.com>
Subject Re: Apache 1.2b8 - invalid "HTML" from ErrorDocument (fwd)
Date Mon, 21 Apr 1997 00:52:50 GMT

wish I'd never asked now...  ;-)

Apache sends text/html for ErrorDocuments that are user-defined strings.

---------- Forwarded message ----------
Date: Mon, 21 Apr 1997 01:13:44 +0100 (BST)
From: WWW server manager <webadm@info.cam.ac.uk>
To: Rob Hartill <robh@imdb.com>
Subject: Re: Apache 1.2b8 - invalid "HTML" from ErrorDocument

> > Unless Apache is going to make some attempt at sending a complete HTML
> > document (including encoding of characters in the string which need it)
> > surely the response should be Content-Type: text/plain ?
> 
> Does it matter ?  :-)

It depends on whether it's going to be processed by a client that accepts
random junk or something that expects to see text/html when that's what
the server tells it!

> Seems completely harmless to me and useful if anyone decides to embed
> HTML in the error text string.

Yes, but it ought at least to be wrapped in appropriate HTML (trivial) and
documented as being sent to the client unmodified so that the text had better
be valid as HTML (e.g. person writing the config directive responsible for
any &-escapes, etc.).
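
As a rough illustration of the "&-escapes" point, here is a minimal Python
sketch (not Apache code) of the encoding a plain-text message would need
before it could safely be labelled text/html. The function name
escape_error_text and the sample message are hypothetical; html.escape is
from the Python standard library.

```python
import html


def escape_error_text(message: str) -> str:
    """Encode the characters that are special in HTML: &, <, > and quotes."""
    return html.escape(message, quote=True)


if __name__ == "__main__":
    # Hypothetical error string of the kind a config author might write.
    print(escape_error_text('Access to <private> & "internal" areas is forbidden'))
```

If the string is instead sent unmodified, as suggested above, this encoding
becomes the job of whoever writes the config directive.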

The reason why it particularly irritated me was that the RewriteRule mistake
I was investigating resulted in a totally unexpected 400 error complaining
about an invalid request, yet seemed to be sending a 403 with error document
when I tried it with telnet. (It turned out that a missing "/" combined
with a buggy rewrite rule caused 400 in the browser cases - I automatically
included the / without noticing, via telnet.)

Anyway, the fact that I was being sent plain text mislabelled as text/html
caused me to waste a substantial amount of time looking for entirely the wrong
sort of problem - I thought there was some far more fundamental problem
at work, with a bad interaction between ErrorDocument and the rewriting module.
The whole point of the test that got sidetracked due to the trivial 
mistake was to confirm that a rewrite rule returning "forbidden" would trigger 
an ErrorDocument for status 403, and the fact that something was going 
bizarrely wrong (status sometimes 400 instead of 403) combined with the wrong 
Content-Type being sent led to the conclusion that something was *very* 
seriously wrong. 

It didn't occur to me that it might be a feature that the error document was
mislabelled, since Apache is normally extremely good about doing The Right 
Thing (tm). Though perhaps I should have heard alarm bells after reading 
several complaints earlier (catching up on reading news) about problems due to 
fancy indexes ending up with multiple <BODY> tags - but getting the 
"string"-format ErrorDocument definitions handled right should be much easier 
unless you want to allow people to put complete HTML documents in the string, 
with all tags (in which case a filename would be more appropriate than a 
string). Defining that the text must be appropriate for wrapping in 

<HTML>
<HEAD>
<TITLE>(standard text for error code)</TITLE>
</HEAD>
<BODY>                                               
<!-- maybe H1 heading here repeating the title -->
<!-- ErrorDocument string here as unmodified text -->
<ADDRESS><!-- configured contact address as mailto: URL --></ADDRESS>
</BODY>
</HTML>

would seem entirely reasonable, and that's roughly how I had always assumed 
the text would be used.
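
The wrapping described above could be sketched roughly as follows. This is
an illustration in Python under stated assumptions, not how Apache builds
its responses: the function name wrap_error_document, the H1 heading and the
contact address are all hypothetical, and the "standard text" for each
status is taken from Python's http.HTTPStatus. The string is inserted
unmodified, so it must already be valid as HTML.

```python
from http import HTTPStatus


def wrap_error_document(status: int, text: str, contact: str) -> str:
    """Wrap a user-supplied error string in a minimal HTML page.

    `text` is inserted unmodified, so the caller is responsible for any
    &-escapes it needs.
    """
    # Standard reason phrase for the status code, e.g. "Forbidden" for 403.
    title = f"{status} {HTTPStatus(status).phrase}"
    return "\n".join([
        "<HTML>",
        "<HEAD>",
        f"<TITLE>{title}</TITLE>",
        "</HEAD>",
        "<BODY>",
        f"<H1>{title}</H1>",
        text,
        f'<ADDRESS><A HREF="mailto:{contact}">{contact}</A></ADDRESS>',
        "</BODY>",
        "</HTML>",
    ])


if __name__ == "__main__":
    # Hypothetical message and contact address, for illustration only.
    print(wrap_error_document(403, "You may not view this page.",
                              "webmaster@example.org"))
```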

It's late & I'm tired, so this probably sounds more annoyed than I intend it to 
be, but it did waste a fair bit of time (don't know how much, I wasn't watching 
the clock). Basically, it was unexpected, seems to me to be wrong, and I 
expected better from Apache. I used the simple string-format ErrorDocument
as a quick testing aid (I'd normally use a proper document for "real" 
ErrorDocuments), and it turned out to be a substantial hindrance rather than
the simple and reliable tool I'd expected.

                                        John
-- 
University of Cambridge WWW manager account (usually John Line)
Send general WWW-related enquiries to webmaster@ucs.cam.ac.uk

