httpd-dev mailing list archives

From: Sander van Zoest
Subject: Re: unicode file APIs (was: Re: canonical stuff)
Date: Tue, 27 Feb 2001 02:22:43 GMT

On Sun, 25 Feb 2001, dean gaudet wrote:

> > The answer is to have apr_file_open_u() for opening with Unicode filenames,
> > not changing the encoding of the existing apr_file_open. You completely
> > break all possibility of writing portable apps when you do that. And APR is
> > *about* writing portable apps.
>
> i'm a bit of an I18N novice, but doesn't it all just magically work if you
> use UTF-8 encoding everywhere?
>
> UTF-8 deliberately avoids using \0 and / in the encodings.  plain ascii
> works unmodified.  unix filesystems generally support UTF-8 directly
> (because of the \0 and / avoidance).
>
> this allows you to have a single API which understands unicode on all
> platforms -- you don't need to have _u versions which take unicode
> strings.
>
> give this page a perusal:
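
To make the point about \0 and / concrete, here is a minimal sketch in C
of a UTF-8 encoder plus a check that an encoded character never contains
a 0x00 or 0x2F byte unless the character itself is NUL or '/'. The names
utf8_encode and utf8_has_nul_or_slash are hypothetical helpers made up
for illustration, not APR functions, and the encoder assumes code points
capped at U+10FFFF (at most 4 bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8.
 * Returns the number of bytes written (1-4), or 0 if cp is a
 * surrogate or lies above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                       /* plain ASCII, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)  /* surrogates are not characters */
            return 0;
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: returns 1 if the UTF-8 encoding of cp contains
 * a 0x00 byte or a '/' byte, 0 otherwise. */
int utf8_has_nul_or_slash(uint32_t cp)
{
    unsigned char buf[4];
    size_t n = utf8_encode(cp, buf);
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == 0x00 || buf[i] == '/')
            return 1;
    }
    return 0;
}
```

Every byte of a multi-byte sequence has its high bit set, so only
U+0000 and U+002F themselves can produce those two bytes. That is why a
byte-oriented filesystem can store UTF-8 names without being aware of
the encoding.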

i18n can be kind of a pain when you need to convert data whose charset
you do not know, or data that you do not control.

Going to a fully ISO-10646 system (encoded as UTF-8) would kill all the
issues, but the problem is making that migration and converting everything.
There isn't much code out there that does all the charset mappings this
would need.

I do think, as wrowe points out, this probably should be handled inside
APR, so that apache can handle as much as possible in ISO-10646,
especially if everything it interacts with supports it.

Now the problem comes in when you have a 10646-based server and have to
deal with non-10646 data outside of the ASCII and latin1 charsets. You
need to convert somehow, and if we convert to UTF-8 via iconv then I
do not see an issue.
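
As a rough illustration of that conversion step, here is a minimal
sketch in C using the POSIX iconv(3) interface. The function name
to_utf8 and its signature are assumptions made up for this example, not
an existing APR call, and it assumes the caller already knows the source
charset name (for instance "ISO-8859-1").

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string `in`, encoded
 * in charset `from`, to UTF-8 in `out` (capacity `outsize` bytes).
 * Returns 0 on success with `out` NUL-terminated, or -1 on failure
 * (unknown charset, invalid input, or output buffer too small). */
int to_utf8(const char *from, const char *in, char *out, size_t outsize)
{
    assert(from != NULL && in != NULL && out != NULL);
    if (outsize == 0)
        return -1;

    iconv_t cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1)
        return -1;

    /* iconv() takes char ** for the input but only reads from it. */
    char *inp = (char *)in;
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;   /* keep one byte for the terminator */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1) {
        /* Flush any pending shift state; a no-op for UTF-8 output. */
        rc = iconv(cd, NULL, NULL, &outp, &outleft);
    }
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    *outp = '\0';
    return 0;
}
```

A latin1 filename such as "caf\xE9" would come out as "caf\xC3\xA9",
while pure ASCII input passes through byte for byte. The hard part
remains the one above: this only works when the source charset is
actually known.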

Sander van Zoest
Covalent Technologies, Inc.
(415) 536-5218
