httpd-dev mailing list archives

From Roy T. Fielding
Subject removing AddDefaultCharset from config file
Date Fri, 10 Dec 2004 10:12:25 GMT
I've looked back at the Jan-Feb 2000 discussion regarding cross-site
scripting in an attempt to find out why AddDefaultCharset is being
set to iso-8859-1 in 2.x (but not in 1.3.x).  I can't find any rationale
for that behavior -- in fact, several people pointed out that it would
be inappropriate to set any default, which is why it was not set in 1.3.

The purpose of AddDefaultCharset is to provide sites that suffer from
poorly written scripts and cross-site scripting issues an immediate
handle by which they can force a single charset.  As it turns out, 
a charset does nothing to reduce the problem of cross-site scripting
because the browser will either auto-detect (and switch) or the user,
upon seeing a bunch of gibberish, will go up to the menu and switch
the charset just out of curiosity.  The real solution is to
stop reflecting client-provided data back to the browser without first
carefully validating or percent-encoding it.

To make matters worse, the documentation in the default config is
completely wrong:

     # Specify a default charset for all pages sent out. This is
     # always a good idea and opens the door for future internationalisation
     # of your web site, should you ever want it. Specifying it as
     # a default does little harm; as the standard dictates that a page
     # is in iso-8859-1 (latin1) unless specified otherwise i.e. you
     # are merely stating the obvious. There are also some security
     # reasons in browsers, related to javascript and URL parsing
     # which encourage you to always set a default char set.
     AddDefaultCharset ISO-8859-1

First, it only applies to text/plain and text/html, in spite of the
convoluted implementation in core.c.  Second, setting a default in the
server config actually hinders internationalization because normal authors
don't understand config files.  Furthermore, it causes harm because
it overrides the indicators present in the content. There is some argument
to make for doing that to CGI and SSI output for the sake of protecting
idiots from themselves, but not for flat files that do not contain any
generated content.  And the security reasons are not fixed by overriding
the charset anyway -- that just makes it easier for people to ignore the
real problems of unencoded data.  All that is really needed is the
availability of the directive so that *if* a site or tree is subject to
the XSS problem, then the server admins can set a default.
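
As a sketch of that narrower use, the directive can be left off
server-wide and enabled only for the affected tree.  The directory
path below is hypothetical, chosen purely for illustration:

     # No default charset for the server as a whole
     AddDefaultCharset Off

     # Hypothetical tree of scripts known to reflect client input
     <Directory "/www/legacy-cgi">
         AddDefaultCharset ISO-8859-1
     </Directory>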

In short, unless someone can think of a justification for the above
being in the default config for 2.x, I will delete it soon and close
the festering PR 23421.

