incubator-stdcxx-dev mailing list archives

From Martin Sebor
Subject Re: STDCXX-435
Date Tue, 11 Mar 2008 23:42:16 GMT
I don't think it's that simple. IIRC, the problem is that we're calling
mbsrtowcs() on sequences that aren't guaranteed to be NUL-terminated.
It seems that it should be straightforward either to pass mbsrtowcs()
a limit small enough to prevent the function from ever reaching the
end of the (non-NUL-terminated) source sequence, or to make sure there
is a NUL at the end (by copying the short subsequence at the end of
the source sequence into a small local buffer and NUL-terminating it
there). I recall trying to get that approach to work at first and
failing. I no longer remember why it didn't work: whether the code
became too complex and inefficient, or whether it simply wasn't
possible to guarantee that it would be correct in all cases.

In any event, I came to the conclusion that we can't call mbsrtowcs()
to convert whole ranges of characters at once but that we must do the
conversion one character at a time instead. That's what the attached
patch does (I think). Since I wrote it months ago I don't remember
how extensively I tested it, or if it even applies cleanly after so
much time has passed. It does appear to fix the bug in the originally
submitted test case though :)
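
For readers without the attachment, here is a rough sketch of what
one-character-at-a-time conversion can look like. It is not the patch
itself: convert_bounded, convert_result and the status enum are
hypothetical names made up for illustration, and the only library
function assumed is the standard mbrtowc(). The point is that mbrtowc()
takes an explicit byte count, so it never reads past the end of a
source range that has no terminating NUL.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical names for illustration only (not from the stdcxx sources).
enum convert_status { convert_ok, convert_partial, convert_error };

struct convert_result {
    const char*    src_next;   // first unconverted source byte
    wchar_t*       dst_next;   // one past the last wide character written
    convert_status status;
};

// Converts [src, src_end) into [dst, dst_end) one character at a time.
// Each mbrtowc() call is told exactly how many bytes remain, so the
// source range does not need to be NUL-terminated.
convert_result
convert_bounded (const char* src, const char* src_end,
                 wchar_t* dst, wchar_t* dst_end, std::mbstate_t& state)
{
    while (src < src_end && dst < dst_end) {
        // Save the state so an incomplete trailing character can be
        // retried later from its first byte.
        const std::mbstate_t saved = state;

        const std::size_t n =
            std::mbrtowc (dst, src, std::size_t (src_end - src), &state);

        if (n == std::size_t (-1)) {          // invalid sequence
            convert_result res = { src, dst, convert_error };
            return res;
        }

        if (n == std::size_t (-2)) {          // incomplete character
            state = saved;
            convert_result res = { src, dst, convert_partial };
            return res;
        }

        // A return value of 0 means a NUL was converted; it occupies
        // a single byte in the source.
        src += n ? n : 1;
        ++dst;
    }

    convert_result res = { src, dst, convert_ok };
    return res;
}
```

With the values from the test case (one source byte available, room
for two wide characters), a loop like this stops after the single
source byte regardless of how much space the destination has left.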


Eric Lemings wrote:
> From Martin's comments in the Jira issue:
> "...The function attempts to convert the source sequence up until the
> terminating NUL (or an invalid byte) or until it has produced the
> requested number of destination characters. When the destination buffer
> is large enough for more than the number of characters in the source
> sequence the function just keeps converting past the end."
> More than that, I think the call to mbsrtowcs() is using the wrong
> length.  It's using dst_len when it should be using src_len which is
> smaller in the test case listed in the issue.
> Some debugging excerpts:
>     __rw_libc_do_in (state=@0x7fff5d880800, from=0x7fff5d880820 "abc",
>     from_end=0x7fff5d880821 "bc", from_next=@0x7fff5d8807f8,
>     to=0x7fff5d880810, to_limit=0x7fff5d880818,
>     to_next=@0x7fff5d8807f0)
>     at /work/stdcxx/trunk.1/src/wcodecvt.cpp:357
>     372         const _RWSTD_SIZE_T ret = mbsrtowcs (pdst, &psrc, dst_len, &state);
>     (gdb) print src_len
>     $6 = 1
>     (gdb) print dst_len
>     $7 = 2
> I think it should actually use min(src_len, dst_len).
> Thoughts?
> Brad.
