httpd-users mailing list archives

From Jiří Eichler <ejir...@seznam.cz>
Subject Re: [users@httpd] Wrong charset convert
Date Wed, 01 Jul 2009 15:40:37 GMT
I understand you very well. I can try looking at the Apache source code and 
recompiling it with some changes.
Anyway, thank you for your time. I have been trying to resolve this for three 
days :-D

Have a nice day,
Jiri Eichler

André Warnier wrote:
> Jiří Eichler wrote:
> ...
> Hi.
> I do not know the answer precisely either.
> But I know enough to tell you that in such matters, you must be 
> /extremely/ careful in interpreting what is really going on, at each 
> level.
> Just as a stupid example: when you look at a log file, you must know:
> - has the process that writes that logfile already transformed the data 
>   into some encoding when writing it to the logfile?
> - is the editor which I am using aware of the logfile encoding?
> etc...
> Because otherwise what you see, and what is really there, may be 
> different things.
>
> For example, I think I remember that, internally, in the Windows NTFS 
> filesystem, file names are stored as Unicode (not necessarily UTF-8, 
> it could also be UTF-16 or another Unicode encoding).
> (See for example here : http://www.ntfsrecovery.com/a-ntfs.php)
> But when you look at a directory through the Explorer, these internal 
> filenames /may/ get transformed according to your PC's codepage, just 
> to display it to you.
> So what you think you see, is not necessarily what is really there.
> Understand what I'm saying ?
>
> Just some elements :
> - Apache should not "translate" or "encode" the received URL, because 
> basically it does not know if this URL is in UTF-8, ASCII, or any 
> other encoding. There is no "flag" or "header" in an HTTP request that 
> says in which encoding the "GET" line comes in (e.g. it may also be 
> some Japanese or Chinese encoding).
> So it /must/ take it as bytes.
> - then Apache calls the OS to find the file.  There may, or may not, 
> be some translation there, I really don't know.  It may depend on what 
> API call the program uses to read the directory, and I don't know what 
> Apache uses.
> - it's the same for your C program.  I don't know if the OpenFile() 
> call interprets "name" as a pure byte sequence, or if it converts it 
> internally, or whatever.
> - and we don't know if Apache and your program use the same API calls.
>
> For example, in Java or Perl, there are different ways to open a file 
> and to read/write from it, some with encoding/decoding going on, some 
> not.  Unfortunately, I am incompetent in C and Windows API, so I don't 
> know in that case.
>
> Obviously something is happening somewhere, and obviously it happens 
> differently under Unix and under Windows.
>
> Under Unix/Linux, most of these things are influenced by the "locale" 
> under which the process is running.  Under Windows, it is usually the 
> whole system-wide "International settings" which count.
>
> I think we need an Apache/Windows developer here, to really tell us 
> what is going on.
>
>
> ---------------------------------------------------------------------
> The official User-To-User support forum of the Apache HTTP Server 
> Project.
> See <URL:http://httpd.apache.org/userslist.html> for more info.
> To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
>   "   from the digest: users-digest-unsubscribe@httpd.apache.org
> For additional commands, e-mail: users-help@httpd.apache.org
>
>
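To illustrate the quoted point that the request line has to be taken as bytes, here is a minimal Python sketch. The file name and the choice of cp1250 as the Windows codepage are assumptions made only for the example; nothing here reflects what Apache or the Windows API actually does internally. It only shows that the same percent-decoded bytes give different names depending on which encoding the reader assumes.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical request path: "/žluťoučký.txt" encoded as UTF-8, then percent-encoded.
REQUEST_PATH = "/%C5%BElu%C5%A5ou%C4%8Dk%C3%BD.txt"


def decode_candidates(path):
    """Return the raw bytes of a percent-encoded path and two possible readings."""
    # Percent-decoding yields bytes; the request carries no encoding label.
    raw = unquote_to_bytes(path)
    # Reading 1: the client encoded the name as UTF-8.
    as_utf8 = raw.decode("utf-8")
    # Reading 2: the same bytes taken as cp1250 (assumed codepage for this example).
    # errors="replace" guards against bytes that have no mapping in that codepage.
    as_cp1250 = raw.decode("cp1250", errors="replace")
    return raw, as_utf8, as_cp1250


if __name__ == "__main__":
    raw, as_utf8, as_cp1250 = decode_candidates(REQUEST_PATH)
    print("raw bytes :", raw)
    print("as UTF-8  :", as_utf8)
    print("as cp1250 :", as_cp1250)
```

If one layer treats the bytes as UTF-8 and another as the system codepage, the two will look for differently named files, which would be consistent with the mismatch discussed in this thread.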

