stdcxx-dev mailing list archives

From Martin Sebor <>
Subject Re: [PATCH] STDCXX-1073
Date Wed, 24 Oct 2012 15:11:36 GMT
On 10/24/2012 07:12 AM, Liviu Nicoara wrote:
> A few observations after spending a few hours working on the improved test.
> On 10/21/12 19:08, Martin Sebor wrote:
>> ...
>> There's no requirement that embedded NULs must be preserved
>> (that's just how it happens to be implemented). I find it best
>> to avoid relying on the knowledge of implementation details
>> when exercising the library so that tests don't start failing
>> after a conforming optimization or some such tweak is added
>> to the code.
> I agree with the implementation details part. However, there is only one
> way NULs get processed and that is by keeping them in the string,
> verbatim. In this respect the previous test incarnation was a stronger
> test.

Consider an encoding where UCHAR_MAX doesn't correspond
to a valid character, and a hypothetical implementation that,
before transforming the string, increments the value of each
character by 1. That would eliminate every embedded NUL from
the original string while remaining conforming.

> I modified the test according to the suggestions. The test fails in all
> the corresponding wide-char cases, and I am investigating other potential
> defects as well. For example, I do not think that employing strcoll and
> wcscoll in compare is correct as they stop at the first NUL, although
> strings may contain characters after the NUL that alter the result of
> the comparison.

I would expect the wchar_t specialization to be analogous
to the narrow one. In fact (without looking at the code),
I would even think both could be implemented in terms of
the same function template specialized on the character
type and on the libc string function. (Although I'm not
necessarily suggesting this as the solution to this issue.)

>> Attached is a test case that reproduces the bug without relying
>> on implementation details. The test passes with your patch,
>> confirming that it's good. This test still makes the assumption
>> that lexicographically distinct strings compare
>> unequal in the en_US locale, but I think that's a safe assumption
>> regardless of the encoding (though only for the hardcoded strings).
> All narrow test cases pass fine, but I gotta look into the wide-char
> failures.

Good to hear! I wish I had more time to look into it with you.


> Liviu
