tomcat-users mailing list archives

From "Birte Glimm" <B.Gl...@e-7.com>
Subject RE: off-topic: handling non-ascii characters in URLs
Date Fri, 05 Jan 2001 11:15:24 GMT
True,
it's the browser that encodes the special characters, I think. I sometimes had problems with
unencoded URLs in Netscape, but IE always translates them correctly.
Birte Glimm

-----Original Message-----
From: Kitching Simon [mailto:Simon.Kitching@orange.ch]
Sent: Friday, 5 January 2001 11:58
To: 'tomcat-user@jakarta.apache.org'
Subject: off-topic: handling non-ascii characters in URLs


Hi All,

While following a related thread (RE: a simple test to charset),
a question occurred to me about charset encodings in URLs.
This isn't really Tomcat-related (it has more to do with HTTP standards),
but I thought someone here might be able to offer an answer.

When a webserver sends content to a browser, it can indicate
the character data format (ASCII, Latin-1, UTF-8, etc.) in an HTTP
header. However, how is the character data type specified for data
sent *by* a browser *to* a webserver (i.e. a GET or POST action)?

Andre Alves had an example where an e-accent character
was part of the URL. I saw that IE4 replaced this character
with %E9 when submitting a form using the GET method, but this
really assumes that the receiving webserver is using Latin-1.
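
To make the %E9 observation concrete, here is a minimal Java sketch. It
assumes only the standard java.net.URLEncoder and URLDecoder classes; the
class name UrlCharsetDemo and its two helpers are made up for illustration.
It shows that the same e-accent character yields different escapes depending
on the charset, and that %E9 only decodes back to e-accent if the receiver
also assumes Latin-1.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Tomcat or of any browser.
class UrlCharsetDemo {

    // Percent-encode text using the named charset.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Percent-decode text, interpreting the escaped bytes in the named charset.
    static String decode(String escaped, String charset) {
        try {
            return URLDecoder.decode(escaped, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // e-accent, written as an escape to avoid source-encoding issues

        System.out.println(encode(eAcute, "ISO-8859-1")); // %E9
        System.out.println(encode(eAcute, "UTF-8"));      // %C3%A9

        // The receiver must guess the charset; a wrong guess loses the character.
        System.out.println(eAcute.equals(decode("%E9", "ISO-8859-1"))); // true
        System.out.println(eAcute.equals(decode("%E9", "UTF-8")));      // false
    }
}
```

Nothing in the escaped URL itself says which of the two charsets was used,
which is the ambiguity being asked about.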

There is this thing called an "entity-header" defined in the HTTP
specs, which may contain a "content-encoding" entry. This seems
to cover POST requests OK, as the POSTed data is in an entity-body,
and therefore an entity-header can be used to define its encoding.
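
As a rough sketch of how a receiver might read the body's character set from
an entity-header: in HTTP/1.1 the charset is normally carried as a "charset"
parameter on the Content-Type header, while Content-Encoding names codings
such as gzip. The helper below is hypothetical (the class and method names
are invented here), uses only java.nio.charset, and treats the fallback
charset as an assumption supplied by the caller, since a client may omit the
parameter entirely.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an API of Tomcat or the servlet specification.
class ContentTypeCharset {

    // Return the charset named in a Content-Type header value,
    // or the caller's fallback if none is present or it is unknown.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        // parts[0] is the media type; the rest are "name=value" parameters.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String name = param.substring(eq + 1).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset assumed = StandardCharsets.ISO_8859_1; // assumption, not a rule
        System.out.println(charsetOf(
                "application/x-www-form-urlencoded; charset=UTF-8", assumed));
        System.out.println(charsetOf(
                "application/x-www-form-urlencoded", assumed));
    }
}
```

A request line has no equivalent place to put such a parameter, so a helper
like this can only ever apply to the body.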

But the URLs themselves cannot have their encoding specified by
an entity-header, because they are not in an entity-body. So does
this mean that all URLs should be restricted to ASCII, and that forms
should not use the GET method unless their data content is guaranteed
to be all-ASCII? I remember seeing an article recently about domain
names now being available in Asian ideogram characters, which seems
to indicate otherwise.

Any comments?

Cheers,

Simon
