tomcat-users mailing list archives

From Gregor Schneider <rc4...@googlemail.com>
Subject Re: Tomcat 5 and UTF-8
Date Thu, 02 Apr 2009 17:54:03 GMT
On Thu, Apr 2, 2009 at 7:30 PM, Je suis la poubelle <lapsap7@gmail.com> wrote:
> On Fri, Mar 27, 2009 at 5:34 PM, Christopher Schultz <
> chris@christopherschultz.net> wrote:
>
>
> Setting charset/encoding is to specify computerized information.  It's
> not just a matter of language.  If setting charset in META tag doesn't mean
> anything to you, the same argument applies to setting charset in HTTP
> header.
>

Well, this is the only argument I can agree with.

But the encoding of HTML/XML is the story of which came first: the
chicken or the egg?

I'll give you an example based on our dreadful experiences with XML parsing:

Let's say we have a stream that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<foo>bar</foo>

However, the whole stream is actually encoded in some weird
encoding you've never heard of.

See, the parser needs to know the encoding /in advance/ to be
able to read the encoding declaration from said stream.

See the point?
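
To make the chicken-and-egg problem concrete, here is a minimal Python
sketch. The helper name naive_declared_encoding is made up for this
illustration, and UTF-16 and EBCDIC (cp037) merely stand in for the
"weird encoding"; a real XML parser is more careful than this.

```python
import re

# The stream from the example above, as text.
DOCUMENT = '<?xml version="1.0" encoding="UTF-8"?><foo>bar</foo>'


def naive_declared_encoding(raw: bytes):
    """Hypothetical helper: look for encoding="..." assuming ASCII-like bytes.

    latin-1 maps every byte to a character, so decoding never fails;
    the search only succeeds if the bytes really are ASCII-compatible.
    """
    text = raw.decode("latin-1")
    match = re.search(r'encoding="([^"]+)"', text)
    return match.group(1) if match else None


# Same characters, three different byte streams.
as_ascii = DOCUMENT.encode("ascii")
as_utf16 = DOCUMENT.encode("utf-16-le")
as_ebcdic = DOCUMENT.encode("cp037")

print(naive_declared_encoding(as_ascii))   # declaration is readable
print(naive_declared_encoding(as_utf16))   # declaration is not found
print(naive_declared_encoding(as_ebcdic))  # declaration is not found
```

The declaration is only readable once the reader has already guessed an
encoding that is close enough to the real one.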

Actually, it's good practice to declare the encoding in the document,
but that's about it, and the same goes for a META tag.

On the web, the only thing a parser can rely on is the HTTP header.
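
As a rough illustration of reading the charset from the header rather
than from the body, here is a small Python sketch. The function name
charset_from_content_type is invented for this example; it reuses the
standard library's email.message.Message to parse the parameter.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None.

    Hypothetical helper for illustration; the standard library lowercases
    the charset name it returns.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

When the header carries no charset parameter, the helper returns None
and the client is back to guessing.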

And it gets really nuts when it comes to UTF-8: are we talking about
UTF-8 with or without a BOM? Even the specs are not clear about that.
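
For what the BOM question looks like at the byte level, here is a short
Python sketch. It assumes Python's built-in "utf-8" and "utf-8-sig"
codecs as the two variants; other platforms name and handle them
differently.

```python
import codecs

TEXT = "<foo>bar</foo>"

# "utf-8" writes no BOM; "utf-8-sig" prepends the three BOM bytes.
without_bom = TEXT.encode("utf-8")
with_bom = TEXT.encode("utf-8-sig")

print(with_bom.startswith(codecs.BOM_UTF8))
print(len(with_bom) - len(without_bom))

# A plain UTF-8 decoder keeps the BOM as a U+FEFF character in the text,
# while the "-sig" variant strips it.
decoded_plain = with_bom.decode("utf-8")
decoded_sig = with_bom.decode("utf-8-sig")
print(repr(decoded_plain[0]))
print(decoded_sig == TEXT)
```

A reader that does not expect the BOM ends up with an extra character in
front of the first tag.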

In my opinion, the whole character-set business is a pain in the ass:

I personally wish IETF came up with some specs saying something like

"the first n bytes of any stream have to be encoded in ASCII, containing
the length and encoding type of the rest of the stream".

I put that on my wishlist for Xmas.
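
Purely as a thought experiment, the wished-for framing could look like
the Python sketch below. Everything here is hypothetical: the 32-byte
header size, the "length encoding" layout, and the names frame and
unframe are invented for illustration and do not correspond to any
existing spec.

```python
HEADER_SIZE = 32  # hypothetical fixed header width in bytes


def frame(text: str, encoding: str) -> bytes:
    """Prefix the encoded payload with an ASCII header: '<length> <encoding>'."""
    payload = text.encode(encoding)
    header = f"{len(payload)} {encoding}".encode("ascii")
    if len(header) > HEADER_SIZE:
        raise ValueError("header does not fit into the fixed-size prefix")
    # Pad with spaces so the header always occupies exactly HEADER_SIZE bytes.
    return header.ljust(HEADER_SIZE, b" ") + payload


def unframe(stream: bytes) -> str:
    """Read the ASCII header first, then decode the payload it describes."""
    header = stream[:HEADER_SIZE].decode("ascii")
    length_field, encoding = header.split()
    length = int(length_field)
    payload = stream[HEADER_SIZE:HEADER_SIZE + length]
    return payload.decode(encoding)


message = "<foo>b\u00e4r</foo>"
framed = frame(message, "utf-16-le")
print(framed[:HEADER_SIZE])
print(unframe(framed) == message)
```

The reader never has to guess: the ASCII prefix names the encoding
before a single payload byte is interpreted.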

Rgds

Gregor
-- 
just because you're paranoid, doesn't mean they're not after you...
gpgp-fp: 79A84FA526807026795E4209D3B3FE028B3170B2
gpgp-key available
@ http://pgpkeys.pca.dfn.de:11371
@ http://pgp.mit.edu:11371/


