Message-ID: <468017FA.7010805@rowe-clan.net>
Date: Mon, 25 Jun 2007 14:31:06 -0500
From: "William A. Rowe, Jr."
To: Marshall Powers
CC: dev@apr.apache.org, 'Log4CXX User'
Subject: Re: Problem with iconv charsets...

Marshall Powers wrote:
> The string literal "ISO-8859-1" appears in APR and log4cxx source code. For
> example, from apr-1.2.7/misc/unix/charset.c:
>
> APR_DECLARE(const char*) apr_os_default_encoding (apr_pool_t *pool)
> {
> #ifdef __MVS__
> #    ifdef __CODESET__
>     return __CODESET__;
> #    else
>     return "IBM-1047";
> #    endif
> #endif
>
>     if ('}' == 0xD0) {
>         return "IBM-1047";
>     }
>
>     if ('{' == 0xFB) {
>         return "EDF04";
>     }
>
>     if ('A' == 0xC1) {
>         return "EBCDIC"; /* not useful */
>     }
>
>     if ('A' == 0x41) {
>         return "ISO-8859-1"; /* not necessarily true */
>     }
>
> Are these files generated by configure scripts/ant build files? It doesn't
> seem like they are...

Nope. That is raw, native hackery in an effort not to think through the
problem set. As with all APR code, patches are welcome.

Some thoughts:

 * At run-time this should probably be determined by parsing first LC_ALL,
   then LC_CTYPE in its absence, then falling back to the LANG envvar if
   neither LC_ variable is defined. The codepage follows the period, e.g.
   LANG=en_US.UTF-8 would be parsed as 'UTF-8'.

 * It's reasonably trivial, if iconv is present, to validate the -fallback-
   charset name against iconv within autoconf, presuming this even should
   be ISO-8859-1.

Comments?
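
To make the first suggestion concrete, here is a rough sketch of the run-time lookup, not a patch against APR. It assumes POSIX precedence (a non-empty LC_ALL overrides LC_CTYPE, which overrides LANG) and the usual language_TERRITORY.codeset@modifier layout. The helper names codeset_from_locale and codeset_from_env are hypothetical and do not exist in APR.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not an APR function): copy the codeset part of a
 * locale string such as "en_US.UTF-8" or "de_DE.ISO-8859-15@euro" into buf.
 * Returns buf, or NULL if there is no codeset or buf is too small. */
static const char *codeset_from_locale(const char *locale,
                                       char *buf, size_t buflen)
{
    const char *start, *end;
    size_t len;

    if (locale == NULL || buflen == 0)
        return NULL;

    start = strchr(locale, '.');
    if (start == NULL)
        return NULL;                 /* e.g. "C" or "en_US": no codeset */
    start++;                         /* skip the period itself */

    end = strchr(start, '@');        /* an optional @modifier ends the codeset */
    len = end ? (size_t)(end - start) : strlen(start);
    if (len == 0 || len >= buflen)
        return NULL;

    memcpy(buf, start, len);
    buf[len] = '\0';
    return buf;
}

/* Hypothetical helper: consult LC_ALL, LC_CTYPE, LANG in that order.
 * The first variable that is set and non-empty decides the result, even
 * if it names no codeset; later variables are not consulted after that. */
static const char *codeset_from_env(char *buf, size_t buflen)
{
    static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    size_t i;

    for (i = 0; i < sizeof(vars) / sizeof(vars[0]); i++) {
        const char *val = getenv(vars[i]);
        if (val != NULL && *val != '\0')
            return codeset_from_locale(val, buf, buflen);
    }
    return NULL;                     /* nothing set: caller uses its fallback */
}
```

A caller would pass a small stack buffer to codeset_from_env() and, on NULL, fall through to the compiled-in default that the existing character tests produce. Whether the returned name is one iconv actually accepts is a separate question and is not checked here.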