forrest-dev mailing list archives

From Upayavira ...@upaya.co.uk>
Subject Re: Forrest and UTF-8
Date Tue, 11 May 2004 13:20:17 GMT
Sjur Nørstebø Moshagen wrote:

> On 11 May 2004, at 13:00, Fabrice Bacchella wrote:
>
>> On 10 May 04, at 19:04, Juan Jose Pablos wrote:
>>
>>> +1. UTF-8 should be the default encoding.
>>>
>> Mmm. Does anybody know the minimum version of Internet Explorer or 
>> Netscape that supports UTF-8? Using UTF-8 might break some not-so-old 
>> browsers.
>
>
> The alternative is to support a (potentially large) set of other 
> encodings (which we probably should do anyway). But considering that:
> - ASCII is a true subset of UTF-8, and
> - Xalan, when serializing to HTML, will render all characters defined 
> as entities in the HTML spec as entities (= ASCII) (the defined 
> entities cover most of the non-ASCII part of the 8859-series, as well 
> as other characters),
>
> UTF-8 should be no problem for most of the browsers, even old ones. 
> AND UTF-8 solves a lot of _other_ encoding problems in a multilingual 
> world, of which many are just as problematic for old browsers as UTF-8.
>
> I first perceived the Xalan behaviour as buggy, generating unnecessarily 
> large files in a UTF-8 setting (entities use more space than a 
> multibyte UTF-8 character), but considering backwards compatibility 
> the behaviour is actually not so bad.
>
> To sum up:
> +1 - UTF-8 should be default, with alternative encodings available as 
> an option.

But, if Xalan does as you say, does the encoding make much difference?

Upayavira
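
A rough way to check the question above, sketched in Python rather than with Xalan itself. The helper entity_escape is hypothetical and only imitates the behaviour described in the quoted message (characters with a named entity in the HTML spec are written as entities); it is not Xalan's serializer. Under that assumption, text whose non-ASCII characters all have named entities comes out as pure ASCII, so the bytes are the same whichever encoding is declared. A character without a named entity is left as-is here, and for such characters the encoding would still matter.

```python
from html.entities import codepoint2name  # HTML 4 named entities, keyed by code point


def entity_escape(text: str) -> str:
    """Hypothetical stand-in for the serializer behaviour described above:
    replace non-ASCII characters that have a named HTML entity and leave
    everything else untouched."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp > 127 and cp in codepoint2name:
            parts.append("&" + codepoint2name[cp] + ";")
        else:
            parts.append(ch)
    return "".join(parts)


name = "Sjur Nørstebø Moshagen"
escaped = entity_escape(name)

# The escaped string is pure ASCII, so both encodings give identical bytes.
same_bytes = escaped.encode("utf-8") == escaped.encode("iso-8859-1")

# Size trade-off mentioned in the quoted message: entity vs. raw UTF-8.
entity_size = len("&oslash;".encode("ascii"))
utf8_size = len("ø".encode("utf-8"))

# U+010D has no named HTML 4 entity, so this sketch leaves it unescaped.
uncovered = entity_escape("\u010d")

print(escaped, same_bytes, entity_size, utf8_size)
```

So for text covered by the named entities the declared encoding changes little, at the cost of larger files; the difference shows up only for characters outside that set.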


