Message-ID: <27189479.1200012273849.JavaMail.jira@brutus>
Date: Thu, 10 Jan 2008 16:44:33 -0800 (PST)
From: "Martin Sebor (JIRA)"
To: stdcxx-dev@incubator.apache.org
Reply-To: stdcxx-dev@incubator.apache.org
Subject: [jira] Commented: (STDCXX-499) std::num_put inserts NUL thousand separator
In-Reply-To: <9921079.1185297151257.JavaMail.jira@brutus>
Mailing-List: contact stdcxx-dev-help@incubator.apache.org; run by ezmlm

    [ https://issues.apache.org/jira/browse/STDCXX-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557859#action_12557859 ]

Martin Sebor commented on STDCXX-499:
-------------------------------------

The question is: is this our problem or one with the locale definition (such as the Bulgarian locale on Linux in the test case above)? I.e., is it a valid locale that specifies a grouping but no thousands_sep? Among our own locales there is only one that fits this description, suggesting it might be a bug in the locale definition:

$ (cd ~/stdcxx && for f in `grep -l "^grouping *[1-9]" etc/nls/src/*`; do grep -l "thousands_sep *\"\"" $f; done)
etc/nls/src/bg_BG

The latest glibc bg_BG definition is the same:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/localedata/locales/bg_BG?rev=1.7.2.2&content-type=text/x-cvsweb-markup&cvsroot=glibc

I opened a glibc issue to see if they agree it's a bug:
http://sources.redhat.com/bugzilla/show_bug.cgi?id=5599

If we should decide to work around it I see two possible ways of handling it in punct.cpp, after retrieving the grouping and thousands_sep for the locale using localeconv():

When grouping is not empty and valid and thousands_sep is NUL, either
a) set grouping to "", or
b) set thousands_sep to some non-NUL value.

Solution a) seems safer because it doesn't involve inventing a thousands_sep that's valid for the locale, but the downside is that it loses potentially useful information. Solution b) leaves open the question of which thousands_sep is appropriate for the locale.

> std::num_put inserts NUL thousand separator
> -------------------------------------------
>
>                 Key: STDCXX-499
>                 URL: https://issues.apache.org/jira/browse/STDCXX-499
>             Project: C++ Standard Library
>          Issue Type: Bug
>          Components: 22. Localization
>    Affects Versions: 4.1.2, 4.1.3, 4.1.4
>            Reporter: Martin Sebor
>            Assignee: Martin Sebor
>             Fix For: 4.2.1
>
>
> Moved from Rogue Wave Bugzilla: http://bugzilla.cvo.roguewave.com/show_bug.cgi?id=1913
> -------- Original Message --------
> Subject: num_put and null-character thousand separator
> Date: Tue, 11 Jan 2005 16:10:23 -0500
> From: Boris Gubenko
> Reply-To: Boris Gubenko
> Organization: Hewlett-Packard Co.
> To: Martin Sebor
> Another locale-related issue that we fixed in rw stdlib v3.0 (and in
> v2.0 also) is making sure, that num_put does not insert null thousand
> separator character into the stream. Here is the fix in _num_put.cc
> in v3.0 :
> template > */>
> _TYPENAME num_put<_CharT, _OutputIter>::iter_type
> num_put<_CharT, _OutputIter>::
> _C_put (iter_type __it, ios_base &__flags, char_type __fill, int __type,
>         const void *__pval) const
> {
>     const numpunct &__np =
>         _V3_USE_FACET (numpunct, __flags.getloc ());
>     // FIXME: adjust buffer dynamically as necessary
>     char __buf [_RWSTD_DBL_MAX_10_EXP];
>     char *__pbuf = __buf;
>     const string __grouping = __np.grouping ();
>     const char *__grp = __grouping.c_str ();
>     const int __prec = __flags.precision ();
> #if defined(__VMS) && defined(__DECCXX) && !defined(__DECFIXCXXL1730)
>     const char __nogrouping = _RWSTD_CHAR_MAX;
>     if (!__np.thousands_sep())
>         __grp = &__nogrouping;
> #endif
> Here is the test:
> cosf.zko.dec.com> setenv LANG fr_FR.ISO8859-1
> cosf.zko.dec.com> locale -k thousands_sep
> thousands_sep=""
> cosf.zko.dec.com> cxx x.cxx && a.out
> null character thousand_sep was not inserted
> cosf.zko.dec.com> cxx x.cxx -D_RWSTD_USE_CONFIG -D_RWSTDDEBUG \
>     -I/usr/cxx1/boris/CXXL_1886-2/stdlib-4.0/stdlib/include/ \
>     -nocxxstd -L/usr/cxx1/boris/CXXL_1886-2/result/lib -lstd11s \
>     && a.out
> null character thousand_sep was inserted
> cosf.zko.dec.com>
> x.cxx
> -----
> #ifndef __USE_STD_IOSTREAM
> #define __USE_STD_IOSTREAM
> #endif
> #include
> #include
> #include
> #include
> #include
> #ifdef __linux
> #define FRENCH_LOCALE "fr_FR"
> #else
> #define FRENCH_LOCALE "fr_FR.ISO8859-1"
> #endif
> using namespace std;
> int main()
> {
>     ostringstream os;
>     if (setlocale(LC_ALL,FRENCH_LOCALE))
>     {
>         setlocale(LC_ALL,"C");
>         os.imbue(locale(FRENCH_LOCALE));
>         os << (double) 10000.1 << endl;
>         if ( (os.str())[2] == '\0' )
>             cout << "null character thousand_sep was inserted" << endl;
>         else
>             cout << "null character thousand_sep was not inserted" << endl;
>     }
>     return 0;
> }
> ------- Additional Comments From sebor@roguewave.com 2005-01-11 14:50:44 ----
> -------- Original Message --------
> Subject: Re: num_put and null-character thousand separator
> Date: Tue, 11 Jan 2005 15:50:06 -0700
> From: Martin Sebor
> To: Boris Gubenko
> References: <00f201c4f821$fa0b72c0$29001c10@americas.hpqcorp.net>
> Boris Gubenko wrote:
> > Another locale-related issue that we fixed in rw stdlib v3.0 (and in
> > v2.0 also) is making sure, that num_put does not insert null thousand
> > separator character into the stream. Here is the fix in _num_put.cc
> > in v3.0 :
> I don't think this fix would be quite correct in general. NUL is
> a valid character that the locale library was specifically designed
> to be able to insert and extract just like any other. In addition,
> in the code below, operator==() need not be defined for the character
> type.
>
> > ...
> > Here is the test:
> Thanks for the helpful test case.
> My feeling is that this case points out a fundamental design
> disconnect between the C and C++ locales. In C, NUL is not
> an ordinary character -- it's a special character that terminates
> strings. In addition, C formatted I/O is done in multibyte
> characters. In contrast, in C++, NUL is a character like any other
> and formatted I/O is always done in single chars (or wchar_t when
> char is not wide enough), but never in multibyte characters.
> In C, the thousand separator is a multibyte string so even if
> grouping is non-empty, inserting an empty string will be as good
> as inserting none at all. In C++ the separator is assumed to be
> a single character so there's no way to achieve the same effect.
> Instead, whether a thousand separator gets inserted or not is
> controlled by the grouping string.
> One way to fix this would be to set grouping to "" if thousands_sep
> is NUL, although that wouldn't be quite correct, either, because
> numpunct can be used directly by user programs. I'll have to think
> about how to deal with this. In the meantime, I filed bug 1913 for
> this problem so that you can track it.
> Martin

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
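
Workaround a) from the comment above (set grouping to "" when thousands_sep is NUL) can be illustrated with a minimal C++ sketch. This is not the stdcxx punct.cpp code. The helper effective_grouping, the facets nul_sep_punct and fixed_punct, and format_with are hypothetical names invented for this example, and nul_sep_punct only imitates a locale (such as bg_BG) that reports a non-empty grouping together with a NUL separator.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper illustrating workaround a): if the separator is NUL,
// report an empty grouping so that no separator is ever inserted.
inline std::string effective_grouping(const std::string& grouping,
                                      char thousands_sep)
{
    return thousands_sep == '\0' ? std::string() : grouping;
}

// Hypothetical facet imitating a locale definition that specifies
// groups of three digits but an empty (NUL) thousands separator.
class nul_sep_punct : public std::numpunct<char>
{
protected:
    char do_thousands_sep() const override { return '\0'; }
    std::string do_grouping() const override { return "\3"; }
};

// Same data, but with the workaround applied to the grouping.
class fixed_punct : public nul_sep_punct
{
protected:
    std::string do_grouping() const override
    {
        return effective_grouping(nul_sep_punct::do_grouping(),
                                  do_thousands_sep());
    }
};

// Formats value through a stream imbued with the given numpunct facet.
// The locale takes ownership of the facet and deletes it when done.
inline std::string format_with(std::numpunct<char>* facet, long value)
{
    std::ostringstream os;
    os.imbue(std::locale(std::locale::classic(), facet));
    os << value;
    return os.str();
}
```

With these definitions, format_with(new fixed_punct, 10000) produces only the digits, whereas format_with(new nul_sep_punct, 10000) is expected to contain an embedded NUL on implementations that insert the separator whenever grouping is non-empty. The trade-off is the one noted in the thread: a program that calls numpunct::grouping() directly no longer sees the grouping the locale specified.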