incubator-stdcxx-dev mailing list archives

From Martin Sebor <se...@roguewave.com>
Subject Re: STDCXX-435
Date Wed, 12 Mar 2008 16:52:00 GMT
Eric Lemings wrote:
[...]
> Right.  And I believe that limit is
> 
> 	min(__from_end-__from, __to_limit-__to)

Suppose MB_CUR_MAX=4, the size of the source sequence is 1 byte
(with no terminating NUL), and the byte doesn't form a complete
multibyte character. There's no way to tell the function not to
attempt to read past the first byte. I think this might have
been what made me decide not to use the function, although it
still seems that this case could be handled separately and that
we should still be able to call mbsrtowcs() on the longest source
sequence guaranteed not to cause the function to read past the
end of it, even in the absence of a NUL. Maybe I was just too
lazy to implement the algorithm. It might be worth revisiting
since a few calls to mbsrtowcs() on a small number of subsequences
are likely to be faster than one call to mbrtowc() for each
multibyte character.
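
For illustration, here is a minimal sketch of the one-character-at-a-time
approach discussed in this thread. It is not the attached patch: the
names ConvertResult and convert_bounded are hypothetical. It relies only
on the fact that mbrtowc() takes an explicit byte count, so it never
reads past the end of the source even without a terminating NUL, whereas
mbsrtowcs() has no source-length argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical result type, loosely modeled on codecvt's out-parameters.
struct ConvertResult {
    const char* from_next;  // first unconsumed source byte
    wchar_t*    to_next;    // one past the last wide character written
    bool        ok;         // false if an invalid sequence was found
    bool        partial;    // true if input or output space ran out early
};

// Hypothetical helper: converts [from, from_end) into [to, to_limit)
// one multibyte character at a time. The source need not be
// NUL-terminated because mbrtowc() is told how many bytes it may read.
ConvertResult convert_bounded(const char* from, const char* from_end,
                              wchar_t* to, wchar_t* to_limit,
                              std::mbstate_t& state)
{
    assert(from <= from_end && to <= to_limit);

    ConvertResult res = { from, to, true, false };

    while (res.from_next != from_end && res.to_next != to_limit) {
        const std::size_t avail =
            static_cast<std::size_t>(from_end - res.from_next);

        // Keep a copy so an incomplete trailing character can be
        // retried later without a half-updated conversion state.
        const std::mbstate_t saved = state;

        const std::size_t n =
            std::mbrtowc(res.to_next, res.from_next, avail, &state);

        if (n == static_cast<std::size_t>(-1)) {   // invalid sequence
            res.ok = false;
            return res;
        }
        if (n == static_cast<std::size_t>(-2)) {   // incomplete character
            state = saved;
            res.partial = true;
            return res;
        }

        // mbrtowc() returns 0 when it converts an embedded NUL;
        // this sketch assumes the NUL occupies one byte.
        res.from_next += (n == 0 ? 1 : n);
        ++res.to_next;
    }

    if (res.from_next != from_end)
        res.partial = true;   // destination filled up first

    return res;
}
```

The loop can produce at most min(from_end - from, to_limit - to) wide
characters, since each iteration consumes at least one byte and writes
exactly one wide character.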

> 
> possibly adjusted to account for the length in bytes of the appropriate
> character sequence type, e.g. mbrlen(), wcslen().
> 
>> or make sure there is a NUL at the end (by copying the short
>> subsequence at the end of the source sequence into a small local
>> buffer and NUL terminating it there).
> 
> Also possible.  It would be safer though more involved.
> 
>> I recall trying to get that approach to work
>> at first and failing. I no longer remember why it didn't work:
>> whether the code became too complex and inefficient or
>> whether it simply wasn't possible to guarantee that it would be
>> correct in all cases.
>>
>> In any event, I came to the conclusion that we can't call mbsrtowcs()
>> to convert whole ranges of characters at once but that we must do the
>> conversion one character at a time instead. That's what the attached
>> patch does (I think). Since I wrote it months ago I don't remember
>> how extensively I tested it, or if it even applies cleanly after so
>> much time has passed. It does appear to fix the bug in the originally
>> submitted test case though :)
> 
> I'll do some manual testing with this patch -- give it a test drive.

Thanks! I think all we need to do is rerun the codecvt tests
with it and check for regressions. If there are none, let me know
and I'll commit the patch and add the regression test.

Martin

