stdcxx-dev mailing list archives

From "Martin Sebor (JIRA)" <j...@apache.org>
Subject [jira] Commented: (STDCXX-435) [Linux] std::codecvt_byname("*.UTF-8").in() to_next greater than expected
Date Mon, 04 Jun 2007 23:48:25 GMT

    [ https://issues.apache.org/jira/browse/STDCXX-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501407 ]

Martin Sebor commented on STDCXX-435:
-------------------------------------

The problem seems to be that in libc mode (i.e., when using the underlying C library)
codecvt_byname calls mbsrtowcs() (via __rw_libc_do_in) to convert the source sequence
without making sure the sequence is NUL-terminated. mbsrtowcs() converts the source
sequence up to the terminating NUL (or an invalid byte), or until it has produced the
requested number of destination characters. When the destination buffer has room for more
characters than the source sequence contains, the function just keeps converting past the
end of the source.
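
The behavior described above can be sketched outside the library. The snippet below is only an
illustration, not the stdcxx code or the eventual fix: count_unbounded() and bounded_in() are
hypothetical helpers. The first calls mbsrtowcs() the way the comment describes, with no way to
say where the source range ends. The second is one possible bounded alternative that uses
mbrtowc(), which takes an explicit byte limit. Both assume the default "C" locale and plain
ASCII input.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: shows the over-conversion. mbsrtowcs() has no
// "end of source" parameter, so it stops only at a NUL, an invalid byte,
// or when dst_len wide characters have been written.
std::size_t count_unbounded (const char *src, wchar_t *dst, std::size_t dst_len)
{
    std::mbstate_t state = std::mbstate_t ();
    const char *p = src;
    return std::mbsrtowcs (dst, &p, dst_len, &state);
}

// Hypothetical result codes, loosely modeled on std::codecvt_base::result.
enum bounded_result { bounded_ok, bounded_partial, bounded_error };

// Hypothetical helper: converts [from, from_end) into [to, to_end) one
// character at a time and never reads at or beyond from_end.
bounded_result bounded_in (std::mbstate_t &state,
                           const char *from, const char *from_end,
                           const char *&from_next,
                           wchar_t *to, wchar_t *to_end, wchar_t *&to_next)
{
    from_next = from;
    to_next   = to;

    while (from_next != from_end && to_next != to_end) {
        // the third argument caps the number of bytes mbrtowc() may inspect
        const std::size_t n =
            std::mbrtowc (to_next, from_next,
                          std::size_t (from_end - from_next), &state);

        if (n == std::size_t (-1))
            return bounded_error;        // invalid byte sequence

        if (n == std::size_t (-2)) {
            // incomplete character: the remaining bytes were absorbed
            // into state, so treat them as consumed
            from_next = from_end;
            return bounded_partial;
        }

        from_next += n ? n : 1;          // n == 0 means a NUL was converted
        ++to_next;
    }

    return from_next == from_end ? bounded_ok : bounded_partial;
}
```

With src = "abc" and room for two wide characters, count_unbounded() converts both 'a' and 'b'
even if the caller meant to convert only the first byte, while bounded_in() called with the
range [src, src + 1) stops after 'a'.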

> [Linux] std::codecvt_byname("*.UTF-8").in() to_next greater than expected
> -------------------------------------------------------------------------
>
>                 Key: STDCXX-435
>                 URL: https://issues.apache.org/jira/browse/STDCXX-435
>             Project: C++ Standard Library
>          Issue Type: Bug
>          Components: 22. Localization
>    Affects Versions: 4.1.3
>         Environment: gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)
>            Reporter: Mark Brown
>
> When compiled with gcc 4.1.1 on Linux the program below runs successfully to completion
> as it should. When compiled with stdcxx the facet returns a to_next value that is greater
> than the number of internal (wchar_t) characters actually produced by the conversion and
> consequently the program aborts.
> $ cat t.cpp && make t && ./t
> #include <cassert>
> #include <cwchar>
> #include <locale>
> int main ()
> {
>     const std::locale utf8 ("en_US.UTF-8");
>     typedef std::codecvt<wchar_t, char, std::mbstate_t> UTF8_Cvt;
>     const UTF8_Cvt &cvt = std::use_facet<UTF8_Cvt>(utf8);
>     const char src[] = "abc";
>     wchar_t dst [2] = { L'\0' };
>     const char* from_next;
>     wchar_t* to_next;
>     std::mbstate_t state = std::mbstate_t ();
>     const std::codecvt_base::result res =
>         cvt.in (state,
>                 src, src + 1, from_next,
>                 dst, dst + 2, to_next);
>     assert (1 == from_next - src);
>     assert (1 == to_next - dst);
>     assert ('a' == dst [0]);
> }
> gcc -c -I/home/mbrown/stdcxx/include/ansi -D_RWSTDDEBUG -I/home/mbrown/stdcxx/include -I/build/mbrown/stdcxx-gcc-4.1.1-11S/include -I/home/mbrown/stdcxx/examples/include -pedantic -nostdinc++ -g -W -Wall -Wcast-qual -Winline -Wshadow -Wwrite-strings -Wno-long-long -Wcast-align t.cpp
> t.cpp: In function 'int main()':
> t.cpp:21: warning: unused variable 'res'
> gcc t.o -o t  -L/build/mbrown/stdcxx-gcc-4.1.1-11S/lib  -lstd11S -lsupc++ -lm 
> t: t.cpp:26: int main(): Assertion `1 == from_next - src' failed.
> Aborted

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

