httpd-users mailing list archives

From Martin Kuba <>
Subject Re: [users@httpd] URI with European characters
Date Fri, 04 Feb 2011 10:16:06 GMT
On 4.2.2011 06:48, NLR REDDY wrote:
> Hi,
> We are implementing a German language website but our servers are
> located in USA. We have created files and folders for the site in
> german characters (like schließen.html). Now, the problem is apache
> is unable to decode when the user clicks a link which has uri with
> german characters. I see 404 status message even though I can see the
> file in the folder. Can anyone help me resolve the issue.
> version of apache being used: 2.0.52.


Do not use non-ASCII characters in URLs. I repeat: DO NOT USE NON-ASCII CHARACTERS IN URLS.

The problem is that the RFCs defining URLs, and later URIs, do not define
which encoding is used for non-ASCII characters. The first RFCs for URLs
were silent about encoding, later RFCs for HTML suggested ISO-8859-1,
and later RFCs for URIs recommended UTF-8. But there is no way
to specify which encoding is actually used in a given URL.

So the only safe way is to use ASCII characters only. You can express
non-ASCII characters in ASCII by writing them as %XX, where XX are
two hexadecimal digits. However, this way you express *bytes*,
not *characters*, and that is a big difference for non-ASCII characters.
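
To illustrate the bytes-versus-characters point, here is a minimal sketch
in Python (my choice of language for illustration; it is not part of the
original question). It percent-encodes the file name from the question
under two different encodings and shows that the same character yields
two different URLs:

```python
from urllib.parse import quote, unquote

name = "schließen.html"

# The same character "ß" becomes different bytes in different encodings,
# so the %XX form differs as well.
utf8_form = quote(name, encoding="utf-8")
latin1_form = quote(name, encoding="iso-8859-1")

print(utf8_form)    # schlie%C3%9Fen.html
print(latin1_form)  # schlie%DFen.html

# Decoding with the wrong assumption does not give back the original name:
# the lone byte 0xDF is not valid UTF-8 and becomes U+FFFD.
wrong = unquote(latin1_form, encoding="utf-8")
print(wrong == name)  # False
```

Nothing in the URL itself says which of the two forms was intended, so
the receiving side can only guess.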

Even if you painstakingly ensure that all your URLs are UTF-8 encoded
as %XX and that no buggy MSIE browser breaks them, there may still be
problems when translating URLs to filesystem names.
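
One way such a mismatch can arise is sketched below in Python. It rests
on assumptions that are mine, not facts about the poster's setup: that
the server decodes %XX into raw bytes and compares them with the bytes
of the file name on disk, and that the file was created on a system that
stored names in ISO-8859-1.

```python
from urllib.parse import unquote_to_bytes

# Bytes obtained after decoding the %XX escapes of a UTF-8 encoded URL path.
requested = unquote_to_bytes("/schlie%C3%9Fen.html")

# Hypothetical on-disk names: the same visible name, stored in two encodings.
name_utf8 = "/schließen.html".encode("utf-8")
name_latin1 = "/schließen.html".encode("iso-8859-1")

print(requested == name_utf8)    # True: the file would be found
print(requested == name_latin1)  # False: same visible name, different bytes
```

Under these assumptions the second case would look exactly like the
report: the file is visible in the folder, yet the request for it fails.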

I know what I am talking about: my native language is Czech, which uses
a lot of non-ASCII characters, and I have worked with Czech websites
for the last fourteen years.


Supercomputing Center Brno             Martin Kuba
Institute of Computer Science    email:
Masaryk University   
Botanicka 68a, 60200 Brno, CZ     mobil: +420-603-533775
