tomcat-dev mailing list archives

From Martin Kuba <>
Subject Re: [Bug 23929] - request.setCharacterEncoding(String) doesn't work
Date Thu, 15 Jan 2004 15:31:14 GMT

>>> ------- Additional Comments From  2004-01-14 13:03 
>>> There is a standard for encoding URIs 
>>> (
>>> code.html) but this standard is not consistently followed by clients. 
>>> This causes a number of problems.
>>> 2. The Coyote HTTP/1.1 connector has a URIEncoding attribute which 
>>> defaults to ISO-8859-1.

Why is the default ISO-8859-1, when the recommended encoding
for URIs is UTF-8? That doesn't make sense.

I found the following in the tomcat-dev archive:

 > >> Tomcat will default to US-ASCII instead of UTF-8 so it won't break
 > >> too many existing webapps.  If there are other parts to this story,
 > >> I would be interested in learning of them.

I think that is false. If a web application did not care
about i18n, it cannot be broken by using UTF-8 instead of ISO-8859-1.
And if a web application did use i18n, it was not using ISO-8859-1.
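
To make the comparison concrete, here is a minimal sketch of how the
same percent-encoded input decodes under the two charsets. It uses
java.net.URLDecoder from the JDK as a stand-in, on the assumption that
it maps escaped bytes to characters the same way the connector does;
the class name UriDecodeDemo and the sample strings are made up for
illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UriDecodeDemo {

    // Decode a percent-encoded string, interpreting the escaped bytes
    // in the named charset. Hypothetical helper, not Tomcat code.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Pure ASCII input: both charsets agree on every byte below 0x80.
        String ascii = "city=Brno%20CZ";
        System.out.println(
            decode(ascii, "UTF-8").equals(decode(ascii, "ISO-8859-1")));

        // U+0159 (r with caron) sent as its two UTF-8 bytes.
        String escaped = "%C5%99";
        String asUtf8 = decode(escaped, "UTF-8");        // one character
        String asLatin1 = decode(escaped, "ISO-8859-1"); // two characters
        System.out.println(asUtf8.length() + " vs " + asLatin1.length());
    }
}
```

The two decoders agree as long as every escaped byte is below 0x80 and
only diverge once a client sends non-ASCII data.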

By the way, there is no *standard* which says that URLs should be in UTF-8. is not a standard,
it is a web page in the "Hints&Tips" section :-)

RFC 2396 (URI syntax) doesn't recommend UTF-8; it just says that
"For example, UTF-8 [UTF-8] defines a mapping from sequences of
octets to sequences of characters in the repertoire of ISO 10646."
That's the only place where UTF-8 is mentioned in RFC 2396.

RFC 2718 (Guidelines for new URL Schemes) talks about *new*
URL schemes, not about the old http scheme.

If anybody knows of any other standard which mandates UTF-8
for http URLs, please let me know.

Supercomputing Center Brno             Martin Kuba
Institute of Computer Science    email:
Masaryk University   
Botanicka 68a, 60200 Brno, CZ     mobil: +420-603-533775
