Message-ID: <31447791.1202932748274.JavaMail.jira@brutus>
Date: Wed, 13 Feb 2008 11:59:08 -0800 (PST)
From: "Martin Sebor (JIRA)"
To: issues@stdcxx.apache.org
Reply-To: dev@stdcxx.apache.org
Subject: [jira] Issue Comment Edited: (STDCXX-499) std::num_put inserts NUL thousand separator

    [ https://issues.apache.org/jira/browse/STDCXX-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559638#action_12559638 ]

sebor edited comment on STDCXX-499 at 2/13/08 11:57 AM:
---------------------------------------------------------------

I'm tempted to close
this as Won't Fix since it looks like a rare bug in the locale
definition file. On recent Linux systems there's just one locale that
suffers from this problem: bg_BG. I couldn't find any such locales on
HP-UX. We might want to look to see how many others besides
fr_FR.ISO8859-1 there are on Tru64, and check other platforms to see if
it's more pervasive than just one or two locales.

For future reference, here's an inefficient shell script I used to find
other such locales:

{noformat}
for l in `locale -a`; do \
    LC_NUMERIC=$l locale -ck LC_NUMERIC | grep "thousands_sep=\"\"" >/dev/null; \
    if [ $? -eq 0 ]; then \
        L="$L $l"; \
    fi; \
done \
&& for l in $L; do \
    grp=`LC_NUMERIC=$l locale -ck LC_NUMERIC | grep grouping`; echo $l ": " $grp; \
done
{noformat}

      was (Author: sebor):
    I'm tempted to close this as Won't Fix since it looks like a rare
bug in the locale definition file. On recent Linux systems there's just
one locale that suffers from this problem: bg_BG. I couldn't find any
such locales on HP-UX. We might want to look to see how many others
besides fr_FR.ISO8859-1 there are on Tru64, and check other platforms to
see if it's more pervasive than just one or two locales.

For future reference, here's an inefficient shell script I used to find
other such locales:

{noformat}
for l in `locale -a`; do
    LC_NUMERIC=$l locale -ck LC_NUMERIC | grep "thousands_sep=\"\"" >/dev/null;
    if [ $? -eq 0 ]; then
        L="$L $l";
    fi;
done
&& for l in $L; do
    grp=`LC_NUMERIC=$l locale -ck LC_NUMERIC | grep grouping`; echo $l ": " $grp;
done
{noformat}

> std::num_put inserts NUL thousand separator
> -------------------------------------------
>
> Key: STDCXX-499
> URL: https://issues.apache.org/jira/browse/STDCXX-499
> Project: C++ Standard Library
> Issue Type: Bug
> Components: 22. Localization
> Affects Versions: 4.1.2, 4.1.3, 4.1.4
> Reporter: Martin Sebor
> Assignee: Martin Sebor
> Priority: Minor
> Fix For: 4.2.1
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Moved from Rogue Wave Bugzilla: http://bugzilla.cvo.roguewave.com/show_bug.cgi?id=1913
> -------- Original Message --------
> Subject: num_put and null-character thousand separator
> Date: Tue, 11 Jan 2005 16:10:23 -0500
> From: Boris Gubenko
> Reply-To: Boris Gubenko
> Organization: Hewlett-Packard Co.
> To: Martin Sebor
> Another locale-related issue that we fixed in rw stdlib v3.0 (and in
> v2.0 also) is making sure, that num_put does not insert null thousand
> separator character into the stream. Here is the fix in _num_put.cc
> in v3.0 :
> template <class _CharT, class _OutputIter /* = ostreambuf_iterator<_CharT> */>
> _TYPENAME num_put<_CharT, _OutputIter>::iter_type
> num_put<_CharT, _OutputIter>::
> _C_put (iter_type __it, ios_base &__flags, char_type __fill, int __type,
>         const void *__pval) const
> {
>     const numpunct<char_type> &__np =
>         _V3_USE_FACET (numpunct<char_type>, __flags.getloc ());
>     // FIXME: adjust buffer dynamically as necessary
>     char __buf [_RWSTD_DBL_MAX_10_EXP];
>     char *__pbuf = __buf;
>     const string __grouping = __np.grouping ();
>     const char *__grp = __grouping.c_str ();
>     const int __prec = __flags.precision ();
> #if defined(__VMS) && defined(__DECCXX) && !defined(__DECFIXCXXL1730)
>     const char __nogrouping = _RWSTD_CHAR_MAX;
>     if (!__np.thousands_sep())
>         __grp = &__nogrouping;
> #endif
> Here is the test:
> cosf.zko.dec.com> setenv LANG fr_FR.ISO8859-1
> cosf.zko.dec.com> locale -k thousands_sep
> thousands_sep=""
> cosf.zko.dec.com> cxx x.cxx && a.out
> null character thousand_sep was not inserted
> cosf.zko.dec.com> cxx x.cxx -D_RWSTD_USE_CONFIG -D_RWSTDDEBUG \
>     -I/usr/cxx1/boris/CXXL_1886-2/stdlib-4.0/stdlib/include/ \
>     -nocxxstd -L/usr/cxx1/boris/CXXL_1886-2/result/lib -lstd11s \
>     && a.out
> null character thousand_sep was inserted
> cosf.zko.dec.com>
> x.cxx
> -----
> #ifndef __USE_STD_IOSTREAM
> #define __USE_STD_IOSTREAM
> #endif
> #include <clocale>
> #include <iostream>
> #include <locale>
> #include <sstream>
> #include <string>
> #ifdef __linux
> #define FRENCH_LOCALE "fr_FR"
> #else
> #define FRENCH_LOCALE "fr_FR.ISO8859-1"
> #endif
> using namespace std;
> int main()
> {
>     ostringstream os;
>     if (setlocale(LC_ALL,FRENCH_LOCALE))
>     {
>         setlocale(LC_ALL,"C");
>         os.imbue(locale(FRENCH_LOCALE));
>         os << (double) 10000.1 << endl;
>         if ( (os.str())[2] == '\0' )
>             cout << "null character thousand_sep was inserted" << endl;
>         else
>             cout << "null character thousand_sep was not inserted" << endl;
>     }
>     return 0;
> }
> ------- Additional Comments From sebor@roguewave.com 2005-01-11 14:50:44 ----
> -------- Original Message --------
> Subject: Re: num_put and null-character thousand separator
> Date: Tue, 11 Jan 2005 15:50:06 -0700
> From: Martin Sebor
> To: Boris Gubenko
> References: <00f201c4f821$fa0b72c0$29001c10@americas.hpqcorp.net>
> Boris Gubenko wrote:
> > Another locale-related issue that we fixed in rw stdlib v3.0 (and in
> > v2.0 also) is making sure, that num_put does not insert null thousand
> > separator character into the stream. Here is the fix in _num_put.cc
> > in v3.0 :
> I don't think this fix would be quite correct in general. NUL is
> a valid character that the locale library was specifically designed
> to be able to insert and extract just like any other. In addition,
> in the code below, operator==() need not be defined for the character
> type.
>
> > ...
> > Here is the test:
> Thanks for the helpful test case.
> My feeling is that this case points out a fundamental design
> disconnect between the C and C++ locales. In C, NUL is not
> an ordinary character -- it's a special character that terminates
> strings. In addition, C formatted I/O is done in multibyte
> characters. In contrast, in C++, NUL is a character like any other
> and formatted I/O is always done in single chars (or wchar_t when
> char is not wide enough), but never in multibyte characters.
> In C, the thousand separator is a multibyte string so even if
> grouping is non-empty, inserting an empty string will be as good
> as inserting none at all. In C++ the separator is assumed to be
> a single character so there's no way to achieve the same effect.
> Instead, whether a thousand separator gets inserted or not is
> controlled by the grouping string.
> One way to fix this would be to set grouping to "" if thousands_sep
> is NUL, although that wouldn't be quite correct either, because
> numpunct can be used directly by user programs. I'll have to think
> about how to deal with this. In the meantime, I filed bug 1913 for
> this problem so that you can track it.
> Martin