apr-dev mailing list archives

From "William A. Rowe, Jr." <wr...@rowe-clan.net>
Subject Re: Problem with iconv charsets...
Date Mon, 25 Jun 2007 19:31:06 GMT
Marshall Powers wrote:
> The string literal "ISO-8859-1" appears in APR and log4cxx source code. For
> example, from apr-1.2.7/misc/unix/charset.c:
> 
> APR_DECLARE(const char*) apr_os_default_encoding (apr_pool_t *pool)
> {
> #ifdef __MVS__
> #    ifdef __CODESET__
>         return __CODESET__;
> #    else
>         return "IBM-1047";
> #    endif
> #endif
> 
>     if ('}' == 0xD0) {
>         return "IBM-1047";
>     }
> 
>     if ('{' == 0xFB) {
>         return "EDF04";
>     }
> 
>     if ('A' == 0xC1) {
>         return "EBCDIC"; /* not useful */
>     }
> 
>     if ('A' == 0x41) {
>         return "ISO-8859-1"; /* not necessarily true */
>     }
> 
> Are these files generated by configure scripts/ant build files? It doesn't
> seem like they are...

Nope.  That is raw, native hackery in an effort not to think through the
problem set.  As with all APR code, patches are welcome.

Some thoughts:

 * At run-time this should probably be determined by parsing first
   LC_ALL, or LC_CTYPE in its absence, or falling back to the LANG
   envvar if neither LC_ variable is defined.  The codepage follows
   the period, e.g. LANG=en_US.UTF-8 would be parsed as 'UTF-8'.

 * It's reasonably trivial, if iconv is present, to validate the -fallback-
   charset name against iconv within autoconf, presuming this should even
   be ISO-8859-1.
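
A rough sketch of the run-time lookup from the first point, not a patch
against APR.  The helper names codeset_of_locale() and codeset_from_env()
are made up for illustration, and the sketch assumes the usual POSIX
precedence (LC_ALL, then LC_CTYPE, then LANG) and the usual locale form
language[_territory][.codeset][@modifier]:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the codeset part of a locale string such as
 * "en_US.UTF-8" or "de_DE.ISO-8859-15@euro" into buf.
 * Returns buf, or NULL if the string names no codeset or it does not fit. */
static const char *codeset_of_locale(const char *locale,
                                     char *buf, size_t buflen)
{
    const char *start, *end;
    size_t len;

    assert(buf != NULL && buflen > 0);

    if (locale == NULL || (start = strchr(locale, '.')) == NULL) {
        return NULL;                /* e.g. "C", "POSIX", "en_US" */
    }
    start++;                        /* skip the period */
    end = strchr(start, '@');       /* optional modifier, e.g. "@euro" */
    len = end ? (size_t)(end - start) : strlen(start);
    if (len == 0 || len >= buflen) {
        return NULL;
    }
    memcpy(buf, start, len);
    buf[len] = '\0';
    return buf;
}

/* Hypothetical helper: consult the first non-empty variable among
 * LC_ALL, LC_CTYPE and LANG (assumed precedence, see above). */
static const char *codeset_from_env(char *buf, size_t buflen)
{
    static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    size_t i;

    for (i = 0; i < sizeof(vars) / sizeof(vars[0]); i++) {
        const char *val = getenv(vars[i]);
        if (val != NULL && *val != '\0') {
            return codeset_of_locale(val, buf, buflen);
        }
    }
    return NULL;                    /* caller falls back to a default */
}
```

A NULL result means the environment did not name a codeset, so the caller
would still need some fallback, and the returned name is not checked
against what iconv actually accepts.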

Comments?
