From dev-return-28237-archive-asf-public=cust-asf.ponee.io@apr.apache.org Mon Jul 8 17:05:48 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id 44CF4180665 for ; Mon, 8 Jul 2019 19:05:48 +0200 (CEST) Received: (qmail 92147 invoked by uid 500); 8 Jul 2019 17:05:47 -0000 Mailing-List: contact dev-help@apr.apache.org; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Id: Delivered-To: mailing list dev@apr.apache.org Received: (qmail 92133 invoked by uid 99); 8 Jul 2019 17:05:47 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 08 Jul 2019 17:05:47 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 958B2C0D6D for ; Mon, 8 Jul 2019 17:05:46 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.128 X-Spam-Level: * X-Spam-Status: No, score=1.128 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, PDS_NO_HELO_DNS=1.327, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=visualsvn.com Received: from mx1-he-de.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id DfAJdBCrL588 for ; Mon, 8 Jul 2019 17:05:45 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=2a00:1450:4864:20::244; helo=mail-lj1-x244.google.com; envelope-from=ivan@visualsvn.com; receiver= Received: from mail-lj1-x244.google.com (mail-lj1-x244.google.com [IPv6:2a00:1450:4864:20::244]) by mx1-he-de.apache.org (ASF Mail Server at mx1-he-de.apache.org) with ESMTPS id 730C57E21E for ; Mon, 8 Jul 2019 17:05:44 +0000 (UTC) Received: by mail-lj1-x244.google.com with SMTP id v24so16674833ljg.13 for ; Mon, 08 Jul 2019 10:05:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=visualsvn.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=lc+EvH3UWI2rhxISq+EAcdeOtL/D9jlxvwxy4dITWeA=; b=X2vlhxokaa2ig7Hx/bdtE38ZlxOZ8Pmh+kkdQz0viSpJXNXRb/VVFgk8oG8Ahs185v cmXd8Dwb3qXN039um7lfoICHaLycFzpvvvn4Xx+KRBzl7g2i1WcBDhLLxkHcKQWjvqKk KTNQZwTNXKCGvKOcK5Bgpm8Eo/FBEYrKEniJw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=lc+EvH3UWI2rhxISq+EAcdeOtL/D9jlxvwxy4dITWeA=; b=TTtK0NcAIi3we6d2rWB8k/WSItla0ZU//3x1vqpVKOTp3vNdg9KXkqJHuteYrEQbPh w+J0Bvro0B4sA03bK6qx2cwZ5i5WDd8WzZF3JtW5Kp4UCDkIeqBD4eb8B05glFVMG739 NYhx5Z3KMkknujPM8N6i6RdDifqCTOJvzuPWJxI8r5EZPp49fnNaTEFpJnMLtz+bTTpQ l6cobdVEhetrEP5uoCi9fH+v2gC0a35NaXTBjK/G6hWKGRSvfHijI8BRu26ZjPZL6epP UKN4eOYVBHuncVA1Vzxn1QNhRWCsokTiQWQsORB8BoJHg5nT5iu77dvKiLbHV94Wm5Lp BIIQ== X-Gm-Message-State: APjAAAXDSQEf7eQGVJxtz3D0RLd+R52gk7MsS7ormcXmbPe0KLhErzNw K0fzBIjAZfthO+WR9+hkRJTWCuwVeJWZBJ6ms9s2McQMZIk= X-Google-Smtp-Source: APXvYqwV9yGBVck8D1vqKtCoLusVwskEEMRaGshAZNISCpDDx3uT2x4ypAVf2qehy2S/ciWxWrPjo+9SNJH/NCWla7Y= X-Received: by 2002:a2e:80c8:: with SMTP id r8mr10982784ljg.168.1562605542868; Mon, 08 Jul 2019 10:05:42 -0700 (PDT) MIME-Version: 1.0 References: <20190625142157.28EA93A2D13@svn01-us-west.apache.org> <20190702064938.GA3690@redhat.com> <7f7f8f86-4c9c-a739-2438-ccdbdf7b4208@apache.org> <20190702101831.GA11391@redhat.com> <20190703153721.GA21941@redhat.com> In-Reply-To: <20190703153721.GA21941@redhat.com> From: Ivan Zhakov Date: Mon, 8 Jul 2019 20:05:31 +0300 Message-ID: Subject: Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c To: dev@apr.apache.org, jorton@redhat.com Content-Type: text/plain; charset="UTF-8" On Wed, 3 Jul 2019 at 18:37, Joe Orton wrote: > > On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote: > > They also make the existing old API unusable for many cases thus > > making a switch to the new API mandatory, even though it doesn't have > > to be so (see below). > > > > APR 1.x did not specify that the result of apr_dir_read() has to be allocated > > in dir->pool: > > > > https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8 > > > > This function does not accept a pool argument, and I think that it's natural > > to assume limited lifetime in such case. This is what the current major APR > > users (httpd, svn) do, and other users should probably be ready for that too, > > since on Win32 the subsequent calls to apr_dir_read() already invalidate the > > previous apr_finfo_t. > > Are there many examples in the APR API where returned strings are not > either pool-allocated or constant (process lifetime)? It seems unusual > and very surprising to me. > > A hypothetical API consumer can have made either assumption: > > a) of constant memory (true with limits on Win32, breaks for Unix), or > b) of pool-allocated string lifetime (works for Unix, breaks for Win32) > > neither is documented and both are broken by current implementations. It > is impossible to satisfy both so anybody can argue that fixing either > implementation to match the other is a regression. > > Doubling down on (a) over (b) seems strongly inferior to me, the API > cannot support this without introducing runtime overhead for all callers > (a new iterpool in the dir object), so I'd rather fix this properly with > a new API. > > I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at > the same as introducing _pread() if desired - making the strdup only > happen for _pread would only be a minor tweak, not a whole new subpool. > There is example of apr_hash_t.ITERATOR that is statically allocated in the hash object and returned to the caller. Also the native readdir() API behaves the same way [1]: [[[ The returned pointer, and pointers within the structure, might be invalidated or the structure or the storage areas might be overwritten by a subsequent call to readdir() on the same directory stream. ]]] From this perspective the current behavior of apr_dir_read on Unix looks inconsistent and surprising for the API user because it is impossible to use in a loop without unbounded memory usage. I strongly agree that the revved version of this function that accepts a POOL argument will be good from both usage and implementation terms. I would be +1 on adding such API. But I also think that unless we fix the original issue by using the preallocated buffer/iterpool, the whole thing would still be problematic for any existing API user. More detailed: if we take this opportunity to choose and document the API to consistently return a temporary result, then as the caller I can either process the data during iteration or manually put it into a long-living pool if that is necessary. I think that all current callers work with the API this way. And I can also switch to the new revved function to manually specify the pool and better control the allocations. If I would like to support both APR 1.x and 2.x the compatibility is also straightforward to achieve. If we instead choose an option where the results may be allocated in the dir object pool, then as the caller I cannot avoid unbounded memory usage when supporting both APR 1.x and 2.x. And this approach also introduces the unavoidable the unbounded memory usage for all Win32 users that do not update their code. [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475 -- Ivan Zhakov