tomcat-users mailing list archives

From Anton Tagunov <atagu...@mail.cnt.ru>
Subject Re: Character Encoding problem (umlauts, etc).
Date Sat, 06 Sep 2003 09:46:18 GMT
Hello Robert!

Robert Priest <Robert.Priest@bentley.com> wrote:
RP> I am requesting file :
RP> "/38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt"
RP> but what is coming across in the request is:
RP> "/38CF278C0186B466222FC48571080B83/51/dms00051/???.txt"

Probably your browser is sending it that way.
In any case, I think it is a bad idea to type non-ASCII
characters directly into the browser's URL input line.

You can spy on the traffic between the browser and the
server. I have described how to do that in one of the sections
of my ancient http://tagunov.tripod.com; try to find it there,
and then you'll know for sure what bytes the browser sends.
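
If that page is hard to find, one crude way to see the raw bytes is a
throwaway listener that prints the first request line. This is only a
sketch, not the method from the page above: the class name
RequestLineDump and port 8081 are made up for illustration, and the
browser gets no response back (the connection is simply closed).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class RequestLineDump {

    // Reads bytes up to the first LF. Printable ASCII is kept as-is,
    // CR is dropped, and every other byte is rendered as \xHH so that
    // the exact bytes sent by the browser stay visible.
    static String readRequestLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b == '\r') {
                continue;
            }
            if (b >= 0x20 && b < 0x7F) {
                sb.append((char) b);
            } else {
                sb.append(String.format("\\x%02X", b));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        int port = 8081; // hypothetical port, pick any free one
        try (ServerSocket server = new ServerSocket(port);
             Socket client = server.accept()) {
            // Handles exactly one connection, then exits.
            System.out.println(readRequestLine(client.getInputStream()));
        }
    }
}
```

Point the browser at http://localhost:8081/ followed by the path in
question. A literal "?" in the output means the browser itself replaced
the characters; \xC3\xA4 or \xE4 means the bytes arrived and were lost
later.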

I think it is generally a bad idea to have any non-ASCII
characters in the URL at all. The closest you can get is
to percent-encode them, as %AD and so on. But then you have to be
sure which encoding those bytes are in (UTF-8 or something else).
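
To show why the encoding matters, here is a minimal sketch that
percent-encodes one path segment byte by byte. The class and method
names are hypothetical, and it assumes an ASCII-compatible charset
(UTF-8, ISO-8859-1 and the like); it is not what Tomcat does internally.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PathSegmentEncoder {

    // Percent-encodes a single path segment (no slashes expected).
    // Letters, digits and "-._~" pass through unchanged; every other
    // byte of the chosen charset becomes %HH.
    static String encode(String segment, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte raw : segment.getBytes(charset)) {
            int b = raw & 0xFF;
            boolean unreserved =
                    (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')
                    || (b >= '0' && b <= '9')
                    || b == '-' || b == '.' || b == '_' || b == '~';
            if (unreserved) {
                sb.append((char) b);
            } else {
                sb.append(String.format("%%%02X", b));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "\u00e4\u00e4\u00e4.txt"; // three a-umlauts
        // Same file name, two different URLs depending on the charset:
        System.out.println(encode(name, StandardCharsets.UTF_8));      // %C3%A4%C3%A4%C3%A4.txt
        System.out.println(encode(name, StandardCharsets.ISO_8859_1)); // %E4%E4%E4.txt
    }
}
```

Whichever charset is used to build the link, the server has to decode
the path with the same one, otherwise the file name will not match.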

So, if these are links from your HTML page, why don't you
percent-encode everything in the URL on the server side and
emit, for example, <A
href="context/38CF278C0186B466222FC48571080B83/51/dms00051/%88%AA.txt">

But then, why not get rid of these nasty umlauts altogether?

Why not use only plain Latin letters or, if you already rely heavily
on numeric ids, only numeric ids?

Anton

