apr-dev mailing list archives

From William A Rowe Jr <wr...@rowe-clan.net>
Subject Re: svn commit: r1860745 - /apr/apr/trunk/file_io/win32/dir.c
Date Tue, 11 Jun 2019 22:38:51 GMT
On Tue, Jun 11, 2019 at 1:44 PM Branko Čibej <brane@apache.org> wrote:

>  We either reserve about 2x buffers for file name transliteration in heap
> per thread, or we use the thread stack. As long as we trust that our utf-8
> to ucs-2 logic is rock solid and the allocations and limits are correctly
> coded, this continues to be a safe approach.
> Apropos of that, for 2.0 we're about to or have already ditched support
> for versions of Windows that do not have native UTF-8/UTF-16 conversions
> (ah, yes ... Windows has finally moved from UCS-2 to UTF-16). Wouldn't this
> be the right time to switch to using Windows' functions instead of staying
> with our own? Especially since, with the transition to UTF-16, we have to
> deal correctly with surrogate pairs, something our current code (IIRC)
> doesn't do.

That is a bit of a misnomer. The code is full of references to ucs-2 with
surrogate support, and the combination of the two is utf-16. The comments
can be refreshed to today's utf-16 nomenclature.

Today's logic remains correct, and of course it does the correct thing
here: an unpaired utf-8 surrogate value would be very broken and possibly
even a security issue, much as decoding other invalid utf-8 bytestreams
proved to be.
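
To illustrate the two points above (ucs-2 plus surrogates being utf-16, and
surrogate values inside utf-8 being invalid input), here is a minimal
sketch. It is not the code in APR; the function name utf8_to_utf16_one and
its signature are hypothetical, chosen only for this example. It decodes a
single utf-8 sequence, refuses overlong forms and encoded surrogates
(U+D800..U+DFFF, which RFC 3629 prohibits in utf-8), and emits a surrogate
pair for code points above U+FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only, not APR's conversion routine.
 * Decodes one utf-8 sequence from src (len bytes available) and writes one
 * or two utf-16 code units to out. Returns the number of bytes consumed,
 * or 0 if the input is invalid or truncated. */
static size_t utf8_to_utf16_one(const unsigned char *src, size_t len,
                                uint16_t out[2], size_t *units)
{
    uint32_t cp;
    size_t need, i;

    assert(src != NULL && out != NULL && units != NULL);
    if (len == 0)
        return 0;

    /* The lead byte determines the sequence length. */
    if (src[0] < 0x80)                { cp = src[0];        need = 1; }
    else if ((src[0] & 0xE0) == 0xC0) { cp = src[0] & 0x1F; need = 2; }
    else if ((src[0] & 0xF0) == 0xE0) { cp = src[0] & 0x0F; need = 3; }
    else if ((src[0] & 0xF8) == 0xF0) { cp = src[0] & 0x07; need = 4; }
    else
        return 0;                     /* stray continuation or bad lead */

    if (len < need)
        return 0;                     /* truncated sequence */

    for (i = 1; i < need; i++) {
        if ((src[i] & 0xC0) != 0x80)
            return 0;                 /* not a continuation byte */
        cp = (cp << 6) | (src[i] & 0x3F);
    }

    /* Reject overlong encodings. */
    if ((need == 2 && cp < 0x80) || (need == 3 && cp < 0x800)
        || (need == 4 && cp < 0x10000))
        return 0;

    /* Reject surrogates encoded in utf-8 and values past U+10FFFF. */
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;

    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;        /* fits in one code unit (ucs-2) */
        *units = 1;
    } else {
        cp -= 0x10000;                /* split 20 bits into a pair */
        out[0] = (uint16_t)(0xD800 | (cp >> 10));
        out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        *units = 2;
    }
    return need;
}
```

With this sketch, the four bytes F0 9F 98 80 (U+1F600) consume 4 bytes and
yield the pair D83D DE00, while ED A0 80 (an encoded U+D800) is refused
with a return of 0.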
If you want to look at the win32 APIs, feel free to benchmark, though I
doubt they would outperform the current implementation.
