httpd-dev mailing list archives

From Paul Sutton <p...@ukweb.com>
Subject [BUG] mod_neg charset bug and update
Date Thu, 20 Feb 1997 12:19:43 GMT
I noticed a bug in the accept-charset code whilst looking at mod_negotiation.
It always gives variants with iso-8859-1 charsets a charset quality of
1.0, even if the accept-charset header gives iso-8859-1 a different
quality. There is also a problem when some variants have a charset and
some don't. 

In more detail, if the client sends an Accept-Charset such as

   Accept-Charset: wibble; q=0.2, flan; q=0.7

then variants with charset wibble get a charset q=0.2 and those with flan
q=0.7. But RFC 2068 says that _in addition_ variants with charset
iso-8859-1 are acceptable, so they get an assumed q=1.0 (and are thus
preferred, despite what the header says). Apache already does all this.
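The rule described above can be sketched roughly as follows. This is only an illustration in Python, not the mod_negotiation code; the function names are made up for the example, and treating any other unlisted charset as q=0.0 is an assumption on my part.

```python
def parse_accept_charset(header):
    """Parse an Accept-Charset value into {charset: q}, lower-casing names."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a charset listed without a q parameter defaults to 1.0
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                q = float(value)
        prefs[name.strip().lower()] = q
    return prefs


def charset_quality(prefs, charset):
    """Charset quality for one variant (hypothetical helper)."""
    charset = charset.lower()
    if charset in prefs:
        return prefs[charset]  # listed charsets get the q from the header
    if charset == "iso-8859-1":
        return 1.0  # implicitly acceptable when the header does not list it
    return 0.0  # assumption: other unlisted charsets are not acceptable


prefs = parse_accept_charset("wibble; q=0.2, flan; q=0.7")
for cs in ("wibble", "flan", "iso-8859-1"):
    print(cs, charset_quality(prefs, cs))
```

With this header the iso-8859-1 variant comes out ahead of both listed charsets, because it is not mentioned and so gets the implicit 1.0.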

However the implicit use of q=1.0 for iso-8859-1 means that in order for
a UA to say it does not want iso-8859-1 it has to, for example, send

   Accept-Charset: wibble; q=0.2, flan; q=0.7, iso-8859-1; q=0.01

In the current code the q value for iso-8859-1 is ignored, and iso-8859-1
variants always get q=1.0. This is a bug which means Apache does
not honour the q value for iso-8859-1 in the second example.
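To make the difference concrete, here is a small sketch contrasting the two behaviours. It is not the patch itself; the function names are hypothetical, the header is assumed to be already parsed into a dict of charset to q (with iso-8859-1 at q=0.01), and q=0.0 for unlisted charsets is again an assumption.

```python
def quality_buggy(prefs, charset):
    """Behaviour described as the bug: iso-8859-1 is special-cased first."""
    charset = charset.lower()
    if charset == "iso-8859-1":
        return 1.0  # the header's q for iso-8859-1 is never consulted
    return prefs.get(charset, 0.0)


def quality_fixed(prefs, charset):
    """Intended behaviour: an explicit q wins, 1.0 is only the fallback."""
    charset = charset.lower()
    if charset in prefs:
        return prefs[charset]
    return 1.0 if charset == "iso-8859-1" else 0.0


# Second example header, already parsed.
prefs = {"wibble": 0.2, "flan": 0.7, "iso-8859-1": 0.01}
print(quality_buggy(prefs, "iso-8859-1"))
print(quality_fixed(prefs, "iso-8859-1"))
```

The only change is the order of the checks: the header is looked up before the implicit iso-8859-1 default is applied.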

The attached patch fixes this.

While I was doing this I also considered the situation where you have some
variants which have a charset and some which don't. Currently those which
don't are given a q=1.0. Thus a variant with no charset would be preferred
over variants with charsets which are listed on the above example
Accept-Charset header. There are three ways to handle variants with no
character set assigned:

   1  As at present, give variant a charset q=1.0
   2  Assume variant is iso-8859-1 and give it the q of iso-8859-1 from
      the Accept-Charset header, or 1.0 if not present
   3  Give variant a low priority
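For comparison, the three options could be sketched like this. The function name is hypothetical, and LOW_PRIORITY_Q is a placeholder: nothing here says what value "low priority" actually maps to, so the number is purely illustrative.

```python
LOW_PRIORITY_Q = 0.001  # placeholder value, an assumption for illustration


def no_charset_quality(prefs, option):
    """Charset quality for a variant that has no charset assigned."""
    if option == 1:
        return 1.0  # as at present
    if option == 2:
        # treat the variant as iso-8859-1: header q if listed, else 1.0
        return prefs.get("iso-8859-1", 1.0)
    if option == 3:
        return LOW_PRIORITY_Q  # rank below variants with a listed charset
    raise ValueError("option must be 1, 2 or 3")


prefs = {"wibble": 0.2, "flan": 0.7, "iso-8859-1": 0.01}
for option in (1, 2, 3):
    print(option, no_charset_quality(prefs, option))
```

Under option 1 the untyped variant beats every listed charset; under options 2 and 3 it falls below wibble and flan for this header.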

The second option initially seems the best, but I don't want to build an
assumption about what character set documents are in into Apache. It is the
logical equivalent of saying "if a variant has no language, assume it is
English". Given the international nature of Apache and the web, I don't
think we should make any assumptions about the character set in use on the
server. 

That leaves option 3. This is coded in the enclosed patch.

//pcs
