apr-dev mailing list archives

From Wilfredo Sánchez Vega <wsanc...@wsanchez.net>
Subject Re: apr_filepath_encoding on Darwin
Date Wed, 18 Jul 2007 16:55:50 GMT
On Jul 18, 2007, at 2:11 AM, Joe Orton wrote:

> - it is convention on all modern Unixes I'm aware of that filename
> charset/encoding follows LC_CTYPE; not just Linux.  It may derive from
> Solaris, I think that's where the locale APIs originate.

   I guess I don't know how that works in practice.  When you have an
encoded string, you need to know its encoding.  On a file system,
there is (typically) no metadata to indicate the encoding of the file
name string.

   Suppose I set my locale settings to correspond to encoding A and
write a file, and yours correspond to encoding B.  On Linux, is the
file name expected to display differently for the other user?
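
   The scenario above can be sketched in a few lines of Python.  This is
an illustration only: the file name and the choice of ISO-8859-1 and
UTF-8 as "encoding A" and "encoding B" are assumptions, not something
named in the thread.

```python
# Illustrative only: the name and both encodings are assumptions.
name = "café"

# User A's locale is ISO-8859-1, so the name reaches the disk as these bytes.
on_disk = name.encode("iso-8859-1")

# User B's locale is UTF-8. The same bytes are not valid UTF-8, so the
# undecodable byte is shown as a replacement character.
seen_by_b = on_disk.decode("utf-8", errors="replace")

print(repr(on_disk))
print(seen_by_b == name)
```

   Nothing on disk records which encoding produced the bytes, so user B
has no way to recover the intended name from the file system alone.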

   What we do is expect the application to translate as appropriate,  
so that both users see the same string regardless of the locale  
settings.  (Note that in Darwin, I don't actually think that most  
command line applications work well with locales, so I'm referring  
mostly to GUI apps.)  So in my example, even though your encoding and  
mine are different, it's UTF-8 on disk, and the appropriate encoding  
when displayed.
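
   A minimal sketch of that approach, assuming the application knows
each user's display encoding.  The helper names `to_disk_name` and
`to_display` are hypothetical and are not part of APR or any Darwin API.

```python
def to_disk_name(text: str) -> bytes:
    """Encode a file name for storage: always UTF-8, whatever the locale."""
    return text.encode("utf-8")


def to_display(disk_name: bytes, user_encoding: str) -> bytes:
    """Re-encode a stored UTF-8 name into one user's display encoding."""
    return disk_name.decode("utf-8").encode(user_encoding, errors="replace")


stored = to_disk_name("café")

# Two users with different locale encodings (assumed for illustration).
for_user_a = to_display(stored, "iso-8859-1")
for_user_b = to_display(stored, "utf-8")

# The byte sequences differ, but both decode to the same string.
print(for_user_a.decode("iso-8859-1") == for_user_b.decode("utf-8"))
```

   The on-disk form never depends on who wrote the file; only the last
step, conversion for display, consults the user's encoding.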

> - AFAIK this convention is not standardised anywhere.

   It should at least be documented; word-of-mouth is a poor way to  
apply convention.  But that's neither here nor there.

> - Linux-the-kernel is no different from any other Unix kernel in this
> respect; it doesn't care about filename charset/encoding and doesn't
> set policy for userspace.  Many Linux distributions set up UTF-8
> locales (via $LANG etc) by default, and expect applications to follow
> the convention.
> - if Darwin has a configurable locale, does *not* set this up by
> default such that nl_langinfo(CODESET) returns UTF-8, but does by
> policy require filenames in UTF-8, regardless of locale, I would agree
> with changing apr_filepath_encoding as Erik proposed.  That is the
> case?

   I don't know what the BSD locale system (nl_langinfo, whatever)
does in Darwin; I've never worked with it.  I only know that for file
names, we tell developers to use UTF-8.
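
   For anyone wanting to check what the locale system reports on a given
machine, here is a small sketch using Python's `locale` module, which
wraps the C `setlocale` and `nl_langinfo(CODESET)` calls on Unix-like
systems.  The function name `current_codeset` is hypothetical, and what
it returns on Darwin is not established in this thread.

```python
import locale


def current_codeset() -> str:
    """Return the codeset that the LC_CTYPE locale reports."""
    try:
        # Adopt the LC_CTYPE setting from the environment ($LANG, $LC_ALL, ...).
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale this system lacks; keep the current one.
        pass
    return locale.nl_langinfo(locale.CODESET)


codeset = current_codeset()
print(codeset)
```

   Running this with different `LANG` or `LC_CTYPE` values shows whether
the reported codeset follows the environment or stays fixed.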

