apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ivan Zhakov <i...@visualsvn.com>
Subject Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c
Date Thu, 25 Jul 2019 12:58:39 GMT
On Mon, 8 Jul 2019 at 20:05, Ivan Zhakov <ivan@visualsvn.com> wrote:
>
> On Wed, 3 Jul 2019 at 18:37, Joe Orton <jorton@redhat.com> wrote:
> >
> > On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
> > > They also make the existing old API unusable for many cases thus
> > > making a switch to the new API mandatory, even though it doesn't have
> > > to be so (see below).
> > >
> > > APR 1.x did not specify that the result of apr_dir_read() has to be allocated
> > > in dir->pool:
> > >
> > >   https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
> > >
> > > This function does not accept a pool argument, and I think that it's natural
> > > to assume limited lifetime in such case. This is what the current major APR
> > > users (httpd, svn) do, and other users should probably be ready for that too,
> > > since on Win32 the subsequent calls to apr_dir_read() already invalidate the
> > > previous apr_finfo_t.
> >
> > Are there many examples in the APR API where returned strings are not
> > either pool-allocated or constant (process lifetime)?  It seems unusual
> > and very surprising to me.
> >
> > A hypothetical API consumer can have made either assumption:
> >
> > a) of constant memory (true with limits on Win32, breaks for Unix), or
> > b) of pool-allocated string lifetime (works for Unix, breaks for Win32)
> >
> > neither is documented and both are broken by current implementations. It
> > is impossible to satisfy both so anybody can argue that fixing either
> > implementation to match the other is a regression.
> >
> > Doubling down on (a) over (b) seems strongly inferior to me, the API
> > cannot support this without introducing runtime overhead for all callers
> > (a new iterpool in the dir object), so I'd rather fix this properly with
> > a new API.
> >
> > I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at
> > the same as introducing _pread() if desired - making the strdup only
> > happen for _pread would only be a minor tweak, not a whole new subpool.
> >
>
> There is example of apr_hash_t.ITERATOR that is statically allocated in
> the hash object and returned to the caller.
>
> Also the native readdir() API behaves the same way [1]:
> [[[
> The returned pointer, and pointers within the structure, might be invalidated
> or the structure or the storage areas might be overwritten by a subsequent call
> to readdir() on the same directory stream.
> ]]]
>
> From this perspective the current behavior of apr_dir_read on Unix looks
> inconsistent and surprising for the API user because it is impossible to use
> in a loop without unbounded memory usage.
>
> I strongly agree that the revved version of this function that accepts a POOL
> argument will be good from both usage and implementation terms. I would be +1
> on adding such API. But I also think that unless we fix the original issue
> by using the preallocated buffer/iterpool, the whole thing would still be
> problematic for any existing API user.
>
> More detailed: if we take this opportunity to choose and document the API to
> consistently return a temporary result, then as the caller I can either process
> the data during iteration or manually put it into a long-living pool if that is
> necessary. I think that all current callers work with the API this way. And I
> can also switch to the new revved function to manually specify the pool and
> better control the allocations. If I would like to support both APR 1.x and 2.x
> the compatibility is also straightforward to achieve.
>
> If we instead choose an option where the results may be allocated
> in the dir object pool, then as the caller I cannot avoid unbounded memory
> usage when supporting both APR 1.x and 2.x. And this approach also introduces
> the unavoidable the unbounded memory usage for all Win32 users that
> do not update their code.
>
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475
>
Is there any feedback or thoughts on these proposed changes?

Thanks!

--
Ivan Zhakov

Mime
View raw message