apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ivan Zhakov <i...@visualsvn.com>
Subject Re: svn commit: r1860745 - /apr/apr/trunk/file_io/win32/dir.c
Date Tue, 11 Jun 2019 20:48:36 GMT
On Tue, 11 Jun 2019 at 17:14, William A Rowe Jr <wrowe@rowe-clan.net> wrote:
> On Tue, Jun 11, 2019 at 4:15 AM Branko ─îibej <brane@apache.org> wrote:
>> On 07.06.2019 21:58, William A Rowe Jr wrote:
>> > I think the optimal way is to allocate a pair of apr thread-specific
>> > wchar buffers in each thread's pool on startup, and use those
>> > exclusively per-thread for wchar translations. We could be looking at
>> > 64k/thread exclusively for name translation, but it doesn't seem
>> > unreasonable.
>> >
>> > The alternative is to continue to use stack, we surely don't want to
>> > lock on acquiring or allocating name translation buffers. /shrug
>> Since this is Windows, and there's no embedded Windows environment left
>> that I'm aware of that's still alive, we can continue with using the
>> stack ... but wouldn't it be so much better to alloca() the required
>> size instead of blindly burning through 64k every time? Obviously that
>> means counting the characters first.
> I don't think there is any need to do so. A name buffer needs 32k (right now
> we limit that to 8k, but if we wanted to satisfy the OS-acceptable pathname
> length, we would blow that out 4x the current arbitrary limit.) For a rename,
> make that 2x buffers. But there is no benefit to wasting cycles determining
> the string length ahead of allocation, because the very next call can run
> right past that limit and hit the wall on available stack. Demanding that there
> always be a potential 32k runway of available stack doesn't seem excessive.
> The point to the stack is that it contracts immediately on return. So we aren't
> burning through a 64k buffer - we are ensuring that we have been topped-up
> to 64k remaining. The targets of these calls are all Win32 API invocations,
> we are never nesting them inside further big-buffer allocations of our own.
> What might happen in ntdll we have little control over.
>  We either reserve about 2x buffers for file name transliteration in heap
> per thread, or we use the thread stack. As long as we trust that our utf-8
> to ucs-2 logic is rock solid and the allocations and limits are correctly
> coded, this continues to be a safe approach. The 32k/64k are immediately
> given back in the current stack-based approach, but would simply be boat
> anchors most of the time if they are set aside in heap.
> I'm +/-0 on switching from stack to heap for this particular transformation
> but welcome good suggestions.
> .
I agree that in general allocating 16kb on stack should be avoided. However,
I also think that is it's not too problematic in this specific case, as it
happens within a function that only calls the Win32 API.

Also, this particular code should probably be changed anyway to avoid doing
UTF8->UCS conversions per every apr_dir_read(), and I have that on my TODO

Speaking of this commit, my aim here was to fix accessing the uninitialized
memory, as I did in r1860747. And this turned out to be much simpler with
the ANSI code path removed.

Ivan Zhakov

View raw message