httpd-dev mailing list archives

From "William A. Rowe, Jr."
Subject Filesystem utf-8 i18n?
Date Mon, 02 Oct 2000 23:11:20 GMT

  Does the list feel that UTF-8 encoding of URL resources is well
enough accepted to build something on that premise?

  It would be nearly impossible, and not worthwhile, to rework the
entire Apache server code base to accommodate Unicode.  URL resources
in a country-dependent charset are best left to an input filter (to
translate to UTF-8 or the native codepage, if you can figure out what
that might be :-).  This could include a requery where UTF-8 fails to
locate a resource but requerying in the client's code page succeeds.
That would be entirely an input filtering issue.
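
  To make the requery idea concrete, here is a minimal sketch in C.
Everything in it is hypothetical: resolve_with_requery, the lookup_fn
callback and demo_exists are names made up for illustration, not
Apache or APR functions, and the client's code page is simply assumed
to be ISO-8859-1.  It tries the name as given (presumed UTF-8) and,
only if that fails, transcodes from the assumed code page and retries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Transcode ISO-8859-1 to UTF-8.  Returns 0 on success, -1 if `out`
 * is too small.  ISO-8859-1 is an assumption made for this sketch. */
static int latin1_to_utf8(const char *in, char *out, size_t outlen)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (const unsigned char *p = (const unsigned char *)in; *p; ++p) {
        if (*p < 0x80) {
            if (n + 1 >= outlen) return -1;
            out[n++] = (char)*p;
        } else {
            /* 0x80..0xFF map to a two-byte UTF-8 sequence */
            if (n + 2 >= outlen) return -1;
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    if (n >= outlen) return -1;
    out[n] = '\0';
    return 0;
}

/* Hypothetical lookup callback: nonzero if the UTF-8 name exists. */
typedef int (*lookup_fn)(const char *utf8_name);

/* Try the name as given; if that misses, requery with the name
 * transcoded from the assumed client code page.  Returns the name
 * that resolved (`name` or `scratch`), or NULL if neither did. */
static const char *resolve_with_requery(const char *name, lookup_fn exists,
                                        char *scratch, size_t scratchlen)
{
    if (exists(name))
        return name;
    if (latin1_to_utf8(name, scratch, scratchlen) == 0
        && strcmp(name, scratch) != 0   /* pure ASCII: nothing new to try */
        && exists(scratch))
        return scratch;
    return NULL;
}

/* Stand-in for the filesystem: only "cafe.html" with an e-acute,
 * stored as UTF-8, exists. */
static int demo_exists(const char *utf8_name)
{
    return strcmp(utf8_name, "caf\xC3\xA9.html") == 0;
}
```

  With this, a request for "caf\xE9.html" (ISO-8859-1 bytes) misses on
the first lookup and succeeds on the requery, while a pure-ASCII miss
is not retried at all.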

  I understand that Un*x-in-a-box will live just fine with UTF-8
characters, although they may mean nothing when listed to a 
conventional terminal.

  My underlying thought for WinNT plus other Unicode-enabled platforms
would be to accept encoded UTF-8 in the filename (and other resource
strings) and modify apr_fopen and the other apr functions to handle
it (a UTF-8 -> Unicode transformation, then a call to the wide-char
function).  As an optimization, if the parser found no hints of UTF-8,
it wouldn't bother.  Where the file/resource is UTF-8, we would
even have to change the spawn to pass an entire wide environment for
CGI, including the command line.  But on 'raw' platforms, it would just
pass raw UTF-8, which works if recognized, or fails as not found.
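
  A rough sketch of the transformation step, in portable C so it can
be tried anywhere.  is_plain_ascii and utf8_to_utf16 are hypothetical
helper names, not existing apr functions.  On Win32 the conversion
itself could instead be done with MultiByteToWideChar and CP_UTF8; the
hand-rolled decoder here only illustrates the idea, including the
"no hints, don't bother" shortcut.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every byte is 7-bit ASCII, i.e. the name carries no
 * hint of multi-byte UTF-8 and can go to the narrow API untouched. */
static int is_plain_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s; ++s)
        if ((unsigned char)*s & 0x80)
            return 0;
    return 1;
}

/* Decode UTF-8 into NUL-terminated UTF-16 code units.  Returns the
 * number of units written (terminator excluded), or -1 on malformed
 * input or if `out` is too small. */
static int utf8_to_utf16(const char *in, uint16_t *out, size_t outlen)
{
    static const uint32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;
    assert(in != NULL && out != NULL);
    while (*p) {
        uint32_t cp;
        int extra;
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;          /* stray continuation or bad lead byte */
        ++p;
        for (int i = 0; i < extra; ++i, ++p) {
            if ((*p & 0xC0) != 0x80)
                return -1;       /* truncated or malformed sequence */
            cp = (cp << 6) | (*p & 0x3F);
        }
        /* reject overlong forms, surrogates and out-of-range values */
        if (cp < min_cp[extra] || cp > 0x10FFFF
            || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;
        if (cp < 0x10000) {
            if (n + 1 >= outlen) return -1;
            out[n++] = (uint16_t)cp;
        } else {
            /* above the BMP: emit a surrogate pair */
            if (n + 2 >= outlen) return -1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    if (n >= outlen) return -1;
    out[n] = 0;
    return (int)n;
}
```

  The intended use would be: if is_plain_ascii(name) is true, call the
existing narrow open as today; otherwise convert with utf8_to_utf16
and hand the result to the wide-char function.  Treating a malformed
name as an error rather than guessing is a design choice of this
sketch, not something settled above.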

  Does anyone object to the concept before I tweak around with the idea?
This could perhaps make mod_autoindex output illegible, but then again,
it already is if the user is saving filenames as UTF-8 on Unix.  It ought
to be a directive to mod_autoindex, such as AutoIndexIsUTF8 or something.
Whatever can't be accepted can be fixed up by Jeff's i18n filter.


[Yes Roy... I see you are following the list... I'm prepared to be blasted :-]
