httpd-dev mailing list archives

From "Roy T. Fielding" <>
Subject Re: removing AddDefaultCharset from config file
Date Sat, 11 Dec 2004 02:20:21 GMT
On Dec 10, 2004, at 4:19 AM, Joe Orton wrote:
> My understanding was that the forced default charset *does* prevent
> browsers (or maybe, MSIE) from guessing the charset as UTF-7; UTF-7
> being the special case as it's already an "escaped" encoding and hence
> defies normal escaping-of-client-provided-data tricks.  Is that not
> correct?

Yes and no -- it is both the source of the problem and the biggest
reason that we should NOT set charset as a default.

Consider the following two identical content resources, the first
being sent as

      Content-Type: text/html; charset=ISO-8859-15

and the second being sent with only

      Content-Type: text/html

I've tested the above with all of my browsers.  Safari and MSIE-Mac do
not support utf-7 at all.  Firefox (Mac and Win) supports utf-7, but
only when it is manually set (it does not auto-detect utf-7, even when
read from a local file).

MSIE (Windows), of course, does the least intelligent thing -- it does
not allow users to select utf-7 manually, but does auto-detect utf-7
if it is read from a local file, or if "auto-detect" is enabled,
regardless of the content-type charset parameter -- setting charset has
no effect on MSIE's auto-detect results.  In other words, it
is only at risk for XSS via utf-7 if auto-detect is enabled.
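
To make the risk concrete, here is a minimal Python sketch of why utf-7
slips past ordinary escaping.  The payload string is a hypothetical
example, and Python's utf-7 codec only stands in for what an
auto-detecting client would do; it is not a model of MSIE's actual
detection logic.

```python
import html

# Hypothetical attacker input: a script tag written in UTF-7, so it contains
# none of the characters that HTML escaping looks for.
payload = "+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# Ordinary escaping leaves it untouched: there is no <, >, & or quote in it.
escaped = html.escape(payload)

# What a client that guesses UTF-7 would effectively end up interpreting.
as_utf7 = escaped.encode("ascii").decode("utf-7")

print(escaped)
print(as_utf7)
```

The escaped text is byte-for-byte the same as the input, yet it decodes
to live markup once it is read as utf-7.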

The problem we have created is that AddDefaultCharset causes entire
sites to default to one charset, usually iso-8859-1.  And because it
is set by default (no brains spent thinking about the right value),
it is often set that way even when installed in non-Latin countries
[and there is also a problem in Europe, since iso-8859-15 is where
the euro symbol was added].  As a result, normal users get a higher
frequency of wrong charset declarations in HTTP, for which the only
"standards-compliant" solution short of manually adjusting every
page received is to turn on auto-detect!  In other words, our default
is now causing more users to be vulnerable to utf-7 XSS attacks than
they would otherwise be if we never sent a default charset.
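
The euro case can be illustrated with a short Python sketch.  It
assumes a page authored in iso-8859-15 that the server labels as
iso-8859-1 through a blanket default; the variable names are only for
illustration.

```python
euro = "\u20ac"  # the euro sign

# ISO-8859-15 places the euro sign at byte 0xA4.
body = euro.encode("iso-8859-15")

# A client that trusts a blanket "charset=iso-8859-1" label reads the same
# byte as the generic currency sign instead.
mislabeled = body.decode("iso-8859-1")

# Decoding with the charset the author actually used gives the euro back.
correct = body.decode("iso-8859-15")

print(body, mislabeled, correct)
```

The bytes on the wire are fine; only the label is wrong, and the label
is what the browser believes unless the user overrides it.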

In any case, the only tutorials on cross-site scripting that still
emphasize setting charset are our own (written by Marc) and CERT's
(based on input from Marc).  Those were intended to be temporary
workarounds until folks had a chance to fix the real problems, which
were non-validating scripts that echo untrusted content to users.
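
As a rough sketch of what fixing the real problem looks like, the
Python below validates untrusted input against an allowlist and escapes
it before echoing it back.  The function name, the pattern, and the
length limit are all hypothetical; a real script would choose rules to
suit its own data.

```python
import html
import re

# Hypothetical allowlist: letters, digits and spaces, 1 to 40 characters.
_NAME_RE = re.compile(r"[A-Za-z0-9 ]{1,40}")


def render_greeting(name: str) -> str:
    """Validate untrusted input, then escape it before echoing it."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError("rejected input")
    return "<p>Hello, " + html.escape(name) + "</p>"
```

Because the allowlist has no room for "+" or "-", a utf-7 encoded
payload is rejected before it is ever echoed, whatever charset the
client later guesses.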

After doing another afternoon of research on this one, I am now
convinced that AddDefaultCharset does far more harm than good.

